SIGNIFICANCE TESTS IN DISCRETE DISTRIBUTIONS


H. O. LANCASTER*
University of Sydney 


In discrete distributions, it has often been recommended that an auxiliary
random experiment should be carried out so as to make the size of the test
equal to the significance level. In certain experimental situations, this
procedure would be time-consuming and even embarrassing to the statistician.
An alternative criterion of significance is suggested, the mid or median
probability. To avoid computations, P(χ), the probability of the square root
of a chi-square variable, can be used in the binomial, Poisson and
hypergeometric distributions as an approximation to the mid probability. In
statistical control of counting experiments, or where experiments are being
repeated, P(χ) rather than P(χ_c), the probability of χ corrected for
continuity, gives acceptable approximations of size to significance levels.
Some computations are given for the multinomial distribution which show that
here P(χ²) gives acceptable approximations.


1. INTRODUCTION


IN DISCRETE distributions, the cumulative sum of the probability of the
observation itself and all more extreme has been used as the test function in
the “exact test.” If this “exact probability” is less than the level of
significance, the observation is said to be significant or to lie in the
critical region. As a rule, there is a difference between the size of such a
test and the significance level. Fisher, in [6] and elsewhere, has maintained
that the size must not exceed the significance level, whereas the
Neyman-Pearson procedure is to equalize size and significance level by an
auxiliary random experiment, for a description of which we may refer the
reader to the review of Pearson [10]. In Section 4 we show that there are
serious difficulties to be faced if this test is applied in practice. As an
alternative procedure we introduce the mid- or median probability as a test
function and show that it is plausible to consider it as the result of a
randomization procedure carried out before the experiment. Its use can thus be
reconciled with the Neyman-Pearson theory. In any of these three methods,
there is a good deal of computation. In the commonly met discrete
distributions, this can be avoided by the use of the χ or χ² approximation.
The exact probability corresponds closely with P(χ_c), the probability of χ
corrected for continuity. The mid-probability is closely approximated by P(χ)
as a rule. We consider also the multinomials of the form (0.2+0.3+0.5)^N, and
show that P(χ²) gives a reasonable approximation of size to the significance
levels.





* Formerly Associate Professor in Medical Statistics at the School of Public Health and Tropical Medicine, 
University of Sydney, Australia. 
Finally we lay down rules for criteria of significance and conclude that in many
common situations, P(χ) or P(χ²) has desirable features as a test function.


2. DISCRETE DISTRIBUTIONS 


In discrete distributions, a random variable takes on finitely or denumerably
many values {a_i} with probabilities p_i, i = 0, 1, 2, ...; Σ p_i = 1. There is
no loss of generality in considering some transformation so that the random
variable takes only non-negative integral values and can be written as i. i
takes the values 0, 1, 2, ..., n in the binomial and hypergeometric
distributions but denumerably many integral values in the Poisson. In such
cases the distribution is usually modified by taking the union of all except a
finite number of events as a single event. We suppose that this has been done.
Formally, we are dealing with one-sided tests. We define

    P(i) = \sum_{j \ge i} p_j , \qquad P'(i) = P(i+1) = \sum_{j > i} p_j .    (2.1)

So defined, P(i) is the probability of the exact test and may be referred to as
the exact probability. If there are (n+1) values of i, then P(0) = 1,
P(n) = p_n and P'(n) = 0. Let α be an arbitrary number, 0 < α < 1, chosen as the
level of significance. Then the critical region for the exact test consists of
all those values of i for which P(i) < α. There is always a special or marginal
value of i, I say, for which the inequality P(I) ≥ α > P(I+1) holds. For this
marginal event, i = I, we define


    \theta = \frac{\alpha - P(I+1)}{P(I) - P(I+1)} = \frac{\alpha - P(I+1)}{p_I} .    (2.2)

For a fixed distribution, for example, with {p_i} given by the terms of the
binomial

    p_i = b(i \mid n, q) = \binom{n}{i} q^i (1-q)^{n-i} ,    (2.3)


θ is not a random variable; but with a mixture of binomials where n or q or
both are random variables, θ will be a random variable. Now no a priori
distributions can be postulated for n and q, but a number of cases may be
examined to see whether there is any tendency for θ to have special values. To
this end, a fixed value of α, namely 0.05, has been chosen; and we have given
equal weight to the ten values of n = 40(1)49 and to the forty values of
q = 0.31(0.01)0.69 with 0.50 counted twice. The results are detailed in Table I.
The χ² of goodness of fit is 26.7 with 19 degrees of freedom and so it is a
plausible guess that in discrete distributions treated by the statisticians the
set of θ's will behave as though θ were rectangular in the interval (0, 1). In
some physical contexts, there will be a randomising process; in taking parallel
counts of blood cells or bacterial colonies, the set total

    N = x_1 + x_2 + \cdots + x_k    (2.4)

will be a random variable, under the null hypothesis a Poisson variable with
parameter kλ, where λ is the parameter of the Poisson distribution for an
individual count. Similar considerations hold for the fourfold tables.
TABLE I. THE VALUES OF θ IN 400 BINOMIAL DISTRIBUTIONS
WITH THE SIGNIFICANCE LEVEL, α = 0.05

Value of θ    Frequency    Value of θ    Frequency
  0.00-          16          0.50-          19
  0.05-          18          0.55-          22
  0.10-          12          0.60-          15
  0.15-          16          0.65-          24
  0.20-          15          0.70-          20
  0.25-          22          0.75-          24
  0.30-          28          0.80-          34
  0.35-          13          0.85-          21
  0.40-          21          0.90-          24
  0.45-          16          0.95-          20

“0.05-” means 0.05 ≤ θ < 0.10.
q = 0.31(0.01)0.69, 0.50 counted twice.
n = 40(1)49.
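
The goodness-of-fit figure quoted for Table I can be checked from the tabulated frequencies. The following is a minimal sketch, assuming that the expected frequency in each of the 20 classes is 400/20 = 20 under a rectangular distribution of θ; the function name is illustrative and does not come from the paper.

```python
# Frequencies of theta from Table I, classes 0.00-, 0.05-, ..., 0.95-.
FREQ = [16, 18, 12, 16, 15, 22, 28, 13, 21, 16,
        19, 22, 15, 24, 20, 24, 34, 21, 24, 20]


def chi_square_uniform(freq):
    """Pearson chi-square against equal expected frequencies (assumed model)."""
    expected = sum(freq) / len(freq)
    return sum((f - expected) ** 2 / expected for f in freq)


if __name__ == "__main__":
    # 20 classes give 19 degrees of freedom.
    print(sum(FREQ), len(FREQ) - 1, round(chi_square_uniform(FREQ), 1))
```

The sum of squared deviations from 20 is 534, so the statistic is 534/20 = 26.7, in agreement with the value given in the text.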


3. THE SIZE OF THE TESTS IN DISCRETE DISTRIBUTION 


The size of the exact probability test is never greater than the significance
level, by definition. If a table of the binomial distributions is examined, it
is easily verified that this results in great loss of power. In fact, if the
null hypothesis specifies the value of the binomial index and the value of q,
the power of the test may be less than α for moderate values of the difference
between q of the null distribution and q' of the alternative.


Example 

(i) α = 0.05, n = 20, q = 0.50; P(15) = 0.0207, P(14) = 0.0577.
Power of the test for q' = 0.54 is 0.0461.

(ii) α = 0.05, n = 22, q = 0.50; P(16) = 0.0262, P(15) = 0.0669.
Power of the test for q' = 0.53 is 0.0486.
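
The two examples can be reproduced by summing binomial terms. This is a sketch only: it assumes the one-sided upper-tail test of Section 2 with the critical region taken as those i for which P(i) is below the significance level, and the function names are hypothetical.

```python
from math import comb


def upper_tail(i, n, q):
    """Exact probability P(i): sum of binomial terms b(j | n, q) for j >= i."""
    return sum(comb(n, j) * q**j * (1 - q) ** (n - j) for j in range(i, n + 1))


def critical_value(n, q, alpha):
    """Smallest i whose exact probability falls below alpha."""
    for i in range(n + 2):  # P(n + 1) = 0, so the loop always returns
        if upper_tail(i, n, q) < alpha:
            return i


def size_and_power(n, q_null, q_alt, alpha=0.05):
    """Return (critical i, size under q_null, power under q_alt)."""
    i_crit = critical_value(n, q_null, alpha)
    return i_crit, upper_tail(i_crit, n, q_null), upper_tail(i_crit, n, q_alt)


if __name__ == "__main__":
    for n, q_alt in ((20, 0.54), (22, 0.53)):
        i_crit, size, power = size_and_power(n, 0.50, q_alt)
        print(n, i_crit, round(size, 4), round(power, 4))
```

In both cases the power against the stated alternative stays below the nominal 0.05, which is the loss of power described above.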

Most authors are agreed that in the null case the (effective) rejection rate 
or size of the test should be equal to the (nominal) significance level, when the 
distribution is continuous. For discontinuous distributions, there is not the 
same degree of agreement. Fisher [6], indeed, criticizes authors for “laying 
down axiomatically, what is not agreed or generally true, that the level of 
significance must be equal to the frequency with which the hypothesis is re- 
jected in repeated sampling of any fixed population allowed by hypothesis. 
This intrusive axiom ... seems to be a real bar to progress.” On the other 
hand, even if this axiom is not admitted, it is essential to have a test which will 
assign the correct proportions of sets to the various probability classes in the 
null case, for example, when one is using the technique of statistical control of 
counting experiments, (introduced by Fisher, Thornton and Mackenzie [7]). 
The most convenient form for routine use is given in their Table 19. They are 
considering three parallel counts of bacterial colonies, x_1, x_2 and x_3. Under
ideal conditions, x_1, x_2 and x_3 can be considered to be a sample of three
independent drawings from a Poisson distribution with unknown parameter, λ. If
only variation between x_1, x_2 and x_3 for a fixed sample total, or the joint
distribution of x_1, x_2 and x_3 conditional on x_1 + x_2 + x_3 = N, is
considered, then Fisher, Thornton and Mackenzie [7] show that the test
criterion,


    \chi^2 = \sum_i (x_i - \bar{x})^2 / \bar{x} , \qquad n\bar{x} = \sum_j x_j ,    (3.1)


is distributed approximately as χ² with 2 d.f. P(χ²) is then used to classify
the counts into probability classes, with points of division corresponding to
convenient significance levels of χ², say 0.1(0.2)0.9. Their test of
consistency between counts finally resolves itself into a test of goodness of
fit of the observed assignments in the probability classes to the theoretical.
The justification of this procedure can only be that the discontinuous χ²
approximates in distribution to the continuous χ² and that a 100α%
significance test will reject 100α% of sets; this argument is used by Fisher,
Thornton and Mackenzie [7] on their page 335 and succeeding pages.
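
The classification step can be sketched as follows for three parallel counts. The sketch assumes the 2 d.f. relation P(χ²) = exp(−χ²/2) that the paper uses in Section 7 and the class boundaries 0.1(0.2)0.9; the function names and the sample counts (20, 25, 30) are purely illustrative.

```python
from bisect import bisect_right
from math import exp

BOUNDS = [0.1, 0.3, 0.5, 0.7, 0.9]  # the points of division 0.1(0.2)0.9


def dispersion_chi_square(counts):
    """Criterion (3.1): sum of squared deviations from the mean, over the mean."""
    mean = sum(counts) / len(counts)
    return sum((x - mean) ** 2 for x in counts) / mean


def p_chi_square_2df(chi2):
    """Upper-tail probability of chi-square with 2 d.f. (three counts only)."""
    return exp(-chi2 / 2)


def probability_class(counts):
    """Class index: 0 for P < 0.1, 1 for 0.1 <= P < 0.3, ..., 5 for P >= 0.9."""
    return bisect_right(BOUNDS, p_chi_square_2df(dispersion_chi_square(counts)))


if __name__ == "__main__":
    counts = (20, 25, 30)  # hypothetical plate counts
    chi2 = dispersion_chi_square(counts)
    print(chi2, p_chi_square_2df(chi2), probability_class(counts))
```

For these counts χ² = 2, so P(χ²) = exp(−1), and the set falls in the class 0.3 to 0.5.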


4. THE METHOD OF THE AUXILIARY RANDOM EXPERIMENT 


The possibility of having an experimental outcome, i = I, where
P(I) > α > P'(I) = P(I+1), is considered in the Neyman-Pearson theory and an
auxiliary randomisation procedure is recommended to make the size of the test
equal to α. An auxiliary sampling experiment is carried out after the
experiment has been performed, using random sampling numbers, and a
proportion,

    \{\alpha - P'(I)\} / \{P(I) - P'(I)\} ,

of the event, i = I, rejected. This method gives a size equal to the
significance level, but at a price. Given the same experimental data, the
statistician will not always give the same result even using a test of his own
choice. Another form of inconsistency may be considered. Let the null
hypothesis specify that q = 0.8 and N = 20 in the binomial, {q + (1−q)}^N.
P(20) = 0.01153 and P(19) = 0.06918. An auxiliary random experiment is carried
out and the event, i = 19, is found to be significant at the significance
level, α = 0.05. In the same report, there might be an experiment with N = 21,
q = 0.8 and i = 20. P(21) = 0.00922 and P(20) = 0.05765 and the auxiliary
random experiment decides that i = 20 is not significantly different from the
expected. Together these two results mean that the statistician has decided
that 19/20 = 0.950 is significantly greater than 0.8 but that 20/21 = 0.952 is
not. Comparing the experiments, we find that the difference between them was a
single additional observation which was in favour of the hypothesis being
rejected but which had the opposite effect.
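
The proportion of the marginal event to be rejected can be computed directly for the two cases just described. This is a sketch under the paper's definitions; the function names are hypothetical, and the code only evaluates the proportions, it does not perform the random draw.

```python
from math import comb


def upper_tail(i, n, q):
    """Exact probability P(i): sum of binomial terms for j >= i."""
    return sum(comb(n, j) * q**j * (1 - q) ** (n - j) for j in range(i, n + 1))


def rejection_proportion(i, n, q, alpha):
    """{alpha - P(i+1)} / {P(i) - P(i+1)} for the marginal event i."""
    p_i = upper_tail(i, n, q)
    p_next = upper_tail(i + 1, n, q)
    if not (p_i >= alpha > p_next):
        raise ValueError("i is not the marginal event at this level")
    return (alpha - p_next) / (p_i - p_next)


if __name__ == "__main__":
    print(round(rejection_proportion(19, 20, 0.8, 0.05), 3))
    print(round(rejection_proportion(20, 21, 0.8, 0.05), 3))
```

The proportions come out near 0.67 for N = 20 and 0.84 for N = 21, so either event may be accepted or rejected by the auxiliary experiment, and the conflicting pair of decisions described above can indeed occur.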

The computations for the auxiliary randomization would be found to be 
rather irksome. For example, a bacteriologist may be carrying out two parallel 
plate counts at a technically convenient density, say a mean number of 25 
colonies per plate. He is using the boundaries 0.1(0.2)0.9 for his probability 
classes in the manner explained in Section 3. With an observed combined count 
of 35, only a difference of 5 or of more than 11 will not need to have an auxiliary 
random experiment so that the auxiliary experiment will be carried out in 
79.4% of counts under a true null hypothesis. With a difference |x_1 − x_2| = 3,
the auxiliary random experiment may refer the observation to any one of the
classes, 0.9 to 1.0, 0.7 to 0.9 or 0.5 to 0.7. With a combined count of 36, only
differences |x_1 − x_2| of 6 or of more than 12 will escape the need for a random
experiment, which will be needed in 80.9% of counts. With a combined count of
37, 90.1% of counts will require a random experiment. Suppose that a
bacteriologist carrying out statistical control were exhibiting such results. An intelligent
biologist present might ask, in what proportion of cases is the assignment to a 
probability class due to the auxiliary random experiment. It will surely weaken 
the statistician’s position if he is compelled to admit that the assignment was 
due to the auxiliary random experiment in over 60% of all sets. 


5. THE MEDIAN PROBABILITY AS A TEST FUNCTION 


The median probability, or perhaps more appropriately the mid-probability,
defined in Lancaster [8] by

    P_m(i) = \tfrac{1}{2}\{P(i) + P(i+1)\} = \tfrac{1}{2}\{P(i) + P'(i)\} ,    (5.1)

may be considered as a test function for single experiments, with the rule of
rejection

    P_m(i) < \alpha ,    (5.2)

whereas the rule of rejection with the exact probability is to reject when
P(i) < α. For a given hypothesis and experimental result the mid-probability
will always give the same answer. A comparison of (5.1) and (2.2) shows that
the rule of rejection (5.2) applied to the marginal event, I, is equivalent to
a rule of rejection when θ > ½.
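
As an illustration of (5.1) and (5.2), the mid-probability can be evaluated for the two binomial cases of Section 4. The sketch below assumes the same one-sided upper-tail convention; the function names are illustrative.

```python
from math import comb


def upper_tail(i, n, q):
    """Exact probability P(i): sum of binomial terms for j >= i."""
    return sum(comb(n, j) * q**j * (1 - q) ** (n - j) for j in range(i, n + 1))


def mid_probability(i, n, q):
    """P_m(i) = {P(i) + P(i+1)} / 2, as in (5.1)."""
    return 0.5 * (upper_tail(i, n, q) + upper_tail(i + 1, n, q))


if __name__ == "__main__":
    for i, n in ((19, 20), (20, 21)):
        pm = mid_probability(i, n, 0.8)
        print(n, i, round(pm, 5), "reject" if pm < 0.05 else "accept")
```

Both mid-probabilities fall below 0.05 (about 0.040 and 0.033), so the rule (5.2) rejects in both experiments and always gives the same verdict on the same data.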

Let a test be defined for the marginal event so that the null hypothesis is
rejected when θ > θ_0, and let us determine how often it would agree with the
auxiliary random sampling method in a mixture of populations such that θ is a
rectangularly distributed variable in the range, zero to unity. The agreement
will occur in a proportion,

    \int_0^{\theta_0} (1 - \theta)\, d\theta + \int_{\theta_0}^{1} \theta\, d\theta = 0.75 - (0.5 - \theta_0)^2 ,    (5.3)

under these idealised conditions. For with θ < θ_0, the test will accept the
null hypothesis and the auxiliary sampling method will accept in a proportion
(1−θ). Similarly for θ > θ_0, the auxiliary sampling method will reject the
null hypothesis in a proportion, θ. So that, integrating over the postulated
rectangular distribution of θ, the two tests would agree in about 75% of cases.
Of all such tests with choice of θ_0, the median probability obtained by taking
θ_0 to be 0.5 will agree most often with the auxiliary random sampling method.
Wallis [11], on his page 245, uses, in effect, the geometric mean of P(I)
and P(I+1) as his criterion, which procedure also allows the size of the test to
exceed the significance level and gives a rejection rate higher than that given
by the median probability, since the geometric mean is less than the arithmetic,
that is,

    \sqrt{P(I)\,P(I+1)} < \tfrac{1}{2}\{P(I) + P(I+1)\} .    (5.4)

The results of the two methods will usually agree in the marginal case.
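
The agreement formula (5.3) can be checked numerically. The sketch below assumes, as in the text, that θ is rectangular on (0, 1), that the fixed rule rejects when θ exceeds θ_0, and that the auxiliary experiment rejects with probability θ; the function names are illustrative.

```python
def agreement_closed_form(theta0):
    """Right-hand side of (5.3)."""
    return 0.75 - (0.5 - theta0) ** 2


def agreement_numeric(theta0, steps=100_000):
    """Midpoint-rule integral of the probability of agreement over theta."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h
        # Fixed rule rejects: agreement with probability theta.
        # Fixed rule accepts: agreement with probability 1 - theta.
        total += (theta if theta > theta0 else 1.0 - theta) * h
    return total


if __name__ == "__main__":
    for theta0 in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(theta0, agreement_closed_form(theta0), round(agreement_numeric(theta0), 4))
```

The closed form is largest at θ_0 = 0.5, where it equals 0.75, and falls to 0.5 at either end of the range.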
Both the random auxiliary experiment and the median probability tests
may require rather lengthy computations; however, P(χ²) or P(χ) is shown in
the next section to be a good approximation to the median probability.

The conclusion of this section is that in cases likely to be met by the statis- 
tician, the mid-probability test will agree with the Neyman-Pearson procedure 
in about 75% of the marginal cases and in all other cases. 


6. NORMAL APPROXIMATIONS TO THE PROBABILITIES 


Of the discrete distributions arising in practice, the binomial, the Poisson
and the hypergeometric are the most commonly met. The approximations are
now considered in detail for the binomial but the treatment is applicable to
the other two. The possible events are i = 0, 1, ..., n. By Stirling's
approximation to the factorial or by other means, there is obtained

    p_i \approx (2\pi\sigma^2)^{-1/2} \exp\{-\tfrac{1}{2}(i - nq)^2/\sigma^2\} , \qquad i = 0, 1, 2, \ldots, n ,    (6.1)

where σ² = nq(1−q). It seems natural to equate these values to the areas in a
histogram, where rectangles of the heights, p_i, given in (6.1) are erected on
the base (i − ½, i + ½). The rectangular areas are then approximately equal to
the area under the normal curve with mean, nq, and variance, σ². Yates [12]
noted that the correct area under the normal curve to be equated to the exact
probability, P(i), would correspond to the area from +∞ to i − ½ and not to i.
Yates [12] wrote

    \chi_c = (i - \tfrac{1}{2} - nq)/\sigma .    (6.2)

The corresponding probability, P(χ_c), corresponds closely to the exact
probability in any reasonable cases examined. We may write χ_c(i) for the value
of χ_c at the point i and note that

    P(\chi_c(i)) \approx P(i) ,
    P(\chi_c(i+1)) \approx P(i+1) .    (6.3)

The crude or uncorrected χ may be defined by

    \chi(i) = (i - nq)/\sqrt{nq(1-q)} .    (6.4)


It is evident that

    \chi(i) = \tfrac{1}{2}\{\chi_c(i) + \chi_c(i+1)\} .    (6.5)

To what probability does P(χ(i)) approximate? It seems reasonable to take a
simple linear interpolation and write

    P(\chi(i)) \approx \tfrac{1}{2}\{P(\chi_c(i)) + P(\chi_c(i+1))\}
               \approx \tfrac{1}{2}\{P(i) + P(i+1)\} = P_m(i) .    (6.6)

The P(χ(i)) can thus be expected to be not greatly different from the median
probability. To see this verified empirically let us find the size of the
critical regions defined by these two tests for the two-sided test on the
binomials, (½+½)^N, N = 26(1)120. At the 1% level, they agree in every case and
the average size is 0.96%. At the 5% level they disagree for N = 26, 31, 74, 84
and 95. The average sizes are 5.02% and 4.84% for the uncorrected P(χ) test and
the median probability test, respectively. At the 10% level, they disagree for
N = 72, 83 and 94. The average sizes are 9.99% and 9.85%. At the 30% level,
there is disagreement for N = 44 and 74. The average sizes are 30.08% and
29.82%. At 50% there is one disagreement for N = 106 and the average sizes are
50.10% and 49.97%. At the 70% level there is a disagreement for N = 26 and
the average sizes are 69.77% and 69.47%. At the 90% level there is one
disagreement for N = 63, and the average sizes are 90.06% and 89.85%. In all
we have 570 tests with 13 disagreements. This is only a probable result if the
P(χ) is usually close to the median probability.
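
The correspondences (6.3) and (6.6) can be examined for a single binomial. The sketch below is a one-sided comparison for n = 45 and q = 0.36, values chosen here for illustration rather than the two-sided series tabulated above; the normal tail is computed from the complementary error function and the function names are hypothetical.

```python
from math import comb, erfc, sqrt


def normal_upper_tail(z):
    """Area under the standard normal curve from z to +infinity."""
    return 0.5 * erfc(z / sqrt(2))


def upper_tail(i, n, q):
    """Exact probability P(i): sum of binomial terms for j >= i."""
    return sum(comb(n, j) * q**j * (1 - q) ** (n - j) for j in range(i, n + 1))


def compare(n, q):
    """Rows of (i, P(i), P(chi_c(i)), P_m(i), P(chi(i)))."""
    sigma = sqrt(n * q * (1 - q))
    rows = []
    for i in range(n + 1):
        exact = upper_tail(i, n, q)
        mid = 0.5 * (exact + upper_tail(i + 1, n, q))
        corrected = normal_upper_tail((i - 0.5 - n * q) / sigma)  # (6.2)
        crude = normal_upper_tail((i - n * q) / sigma)            # (6.4)
        rows.append((i, exact, corrected, mid, crude))
    return rows


if __name__ == "__main__":
    for row in compare(45, 0.36)[18:26]:
        print(row[0], *(round(v, 4) for v in row[1:]))
```

In the printed rows the corrected χ follows the exact probability, while the crude χ lies much nearer the mid-probability than the exact probability.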

Another test of the use of P(χ) to approximate to the median probability is
to see how often these tests disagree for the marginal event at the 5% level in
the binomials with q = 0.20(0.01)0.50 and n = 40(1)49, a total of 620
significance tests. In 40 cases out of the 620, the P(χ) test rejected the
hypothesis and the median probability did not, a disagreement in 6.45% of the
marginal cases.

It can be seen that there will be such disagreement because the interpolation
between χ_c(i) and χ_c(i+1) will not be strictly linear. Average values for the
doubtful event, I, in this series might be taken as an n of 45 and a q of 0.36
and so an average value of {P(I) − P(I+1)} = p_I would be

    \{2\pi nq(1-q)\}^{-1/2} \exp\{-\tfrac{1}{2}(1.645)^2\} = 0.10311/3.200 = 0.0322 .

In other words an average sort of finding is P(I) = 0.0661, P(I+1) = 0.0339.
The corresponding normal deviates are 1.506 and 1.825; the mid-point of this
range, corresponding to a mean value of χ, is 1.666 and P(1.666) is 0.0486.
P(χ) will be lower than the median probability. The P(χ) test will correspond
to a value θ_0 of (0.0486 − 0.0339)/0.0322 = 0.457. The median probability will
reject the event, I, in about 50% of cases; the P(χ) test will do so in about
54.3% of cases. There will be disagreement in about 4.3% of cases. On the other
hand, the P(χ) test will agree in over 74% of cases with the auxiliary random
sampling experiment by formula (5.3) and so would be justified alternatively
as giving a greater agreement with the Neyman-Pearson procedure than any
other fixed rule of the form, reject if θ > θ_0. It should be noted that at the
5% level there is only disagreement in say 4% of the marginal cases with I, and
with all other values of i there will be agreement.

The good approximation of P(χ) to the median probability is in accord with
intuition. The approximation (6.6) does not lead to a definite statement on
direction. The observations in the binomial may be ordered by number of
successes or by number of failures. It is quite unreasonable to suppose that
P(χ) would approximate to the exact probability because if it did so for
successes it certainly would not give a good approximation for failures. The
notion that it should approximate to the median probability introduces
symmetry into the considerations.

If it be accepted that the best test criterion is the median probability, the
conclusions of this section lead us to assert that the most satisfactory practical
test of significance in the binomial, Poisson and hypergeometric distributions
is the crude χ. The crude χ always attaches the same probability to the same
experimental data, it is easy to compute and gives approximately the correct
size for the critical region. It should be stressed, however, that this is only so
if the particular distribution given by the {p_i} can be regarded as chosen
randomly from an appropriate class. These conditions are often fulfilled in a
natural sort of way in experimental work; in control of counting, N, the total
of counts within the sample, is a random variable from a Poisson, whose
parameter is again a random variable. For a fourfold table, attention may be
restricted to the case where the choice of rows has been made arbitrarily. The
first column total is then a random variable and so a number of hypergeometric
distributions are generated. a_{.1} is a binomial variable in fact with unknown
parameter, q, and index, a_{..} = a_{1.} + a_{2.}. The distribution of χ is then
a mixture of the distributions,


    F_z(\chi) = F(\chi \mid a_{1\cdot},\, a_{\cdot 1} = z) , namely    (6.7)
    F(\chi) = \sum_z P(z \mid a_{\cdot\cdot},\, q)\, F_z(\chi) .    (6.8)


At any given level, α, for a fixed z, the size of the critical region may be too
large or too small but the averaging out effect of (6.8) will make the size
acceptably close to the theoretical. This will be even more the case if the model
used does not fix a_{1.} and a_{2.} before the experiment; with this model, a
second summation over possible row totals would be introduced into (6.8). This
natural form of randomisation may not be available. For example, the q of the
binomial may be given by hypothesis; a convenient range for N may then be
chosen and a random sampling experiment carried out to determine which N
to use. Although this randomization appears artificial, it may be regarded
as not inconsistent with the argument of pages 52 and 53 of Fisher [5].


7. χ² IN THE MULTINOMIALS, (0.5+0.3+0.2)^N


These multinomials have been used by previous investigators because of the
relative ease of computation, for the three numbers provide the only solution
of the partition of unity into three pairwise different one-place decimals, no
one of which is too small. The outcome of a multinomial experiment is an
ordered set of numbers, or configuration. The relative frequency of each
configuration is a term of the multinomial. The terms are most conveniently
calculated as a product of terms from the binomials (0.5+0.5)^N and
(0.6+0.4)^{N'}, the values of which are tabulated in Eisenhart [3]. The x_i
can be plotted on a triangular grid as homogeneous co-ordinates and the
probabilities written alongside the points representing the configurations,
(x_1, x_2, x_3). Noting also that

    3N\chi^2 = 6x_1^2 + 10x_2^2 + 15x_3^2 - 3N^2 ,    (7.1)

the contours for the different percentage points of χ² can be marked out and the
percentage assignment to each probability class obtained by summation in the
different regions. This percentage is the calculated or effective assignment.
Previous authors, El Shanawany [4] and Neyman and Pearson [9], have compared
the assignments using χ² with the results of the exact test, the events
being ordered by their probability and not by their value of χ². The procedure
TABLE II. SIZES OF THE χ² TEST IN THE MULTINOMIAL
DISTRIBUTIONS, (0.2+0.3+0.5)^N

The Percentage Rejections of the True Null Hypothesis,
with a Nominal Significance Level of

   N       1%      5%     10%     30%     50%     70%     90%
   9      0.82    4.22    8.02   33.66   45.71   74.49   91.50
  10      0.96    5.02    9.03   31.34   53.22   65.27   91.50
  11      0.86    3.68    9.67   29.23   47.88   78.95   92.20
  12      0.69    4.47   10.20   31.81   48.61   74.45   92.98
  13      1.38    5.28    8.37   27.70   47.03   70.71   87.40
  14      0.65    4.51   10.54   31.09   50.07   68.31   93.92
  15      0.94    4.74    8.67   25.85   54.59   70.55   88.83
  16      1.00    4.42    8.79   30.48   50.93   71.58   94.53
  17      0.90    4.30   11.33   36.21   50.31   73.19   90.18
  18      0.83    4.36    8.87   28.12   49.21   71.04   90.70
  19      0.88    5.06   10.47   33.02   49.13   68.81   86.74
  20      0.78    4.75   10.04   29.13   51.02   72.90   95.58
  21      0.94    4.83   10.03   30.98   52.78   67.90   88.09
  22      0.87    5.33   10.62   29.87   53.98   75.25   84.94
  23      0.83    4.04    9.84   29.00   48.96   73.57   88.97
  24      0.96    5.29   11.31   30.74   50.77   71.36   93.05
  25      0.92    4.67   10.05   29.07   44.81   69.55   93.05
  26      0.96    4.53    9.87   32.45   51.42   70.74   87.28
  27      0.79    5.11    9.54   30.35   52.79   69.39   90.52
  28      1.10    4.62    9.98   30.86   49.31   68.16   87.85
  29      0.85    4.68   10.05   30.09   48.59   73.35   91.05
  30      0.89    4.64    9.20   30.45   49.87   67.74   91.53
  31      1.14    4.52    9.83   30.71   50.77   72.60   91.69
91.89P    0.92    4.59    9.19   27.57   45.95   64.32   82.70

The percentage of rejections, due to the set with the χ² closest to the
100P% point, is equal approximately to 91.89 P/N.


here is to order the various configurations (x_1, x_2, x_3) according to the χ²
value and to compare the cumulative functions of the discrete distributions
with the theoretical at arbitrary points, 0.01, 0.05, 0.1(0.2)0.9, 1.0. The
results are given in Tables II and III for N = 9(1)31. In Table II, we find
that for nominal significance levels of 0.9, 0.7, 0.5, 0.3 and 0.1, the
calculated frequencies of assignment or sizes at each level are quite close to
the theoretical. The calculated and theoretical values are still reasonably
close even at the 5% level and the calculated proportion is usually below the
theoretical. At the 1% level there are greater relative differences between the
calculated and theoretical but since the values for the calculated vary only
from 0.65 to 1.38, they too may be regarded as not unreasonable approximations.
And yet, these are much lower values of N than will usually be met in practice.
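
The calculated sizes can be reproduced by enumerating the configurations. The sketch below assumes the cell probabilities (0.5, 0.3, 0.2), the 2 d.f. relation P(χ²) = exp(−χ²/2), and rejection when χ² exceeds the percentage point; the function names are illustrative and the enumeration is a brute-force stand-in for the triangular-grid method described in the text.

```python
from math import factorial, log

PROBS = (0.5, 0.3, 0.2)


def configurations(n):
    """All ordered triples (x1, x2, x3) of non-negative integers summing to n."""
    for x1 in range(n + 1):
        for x2 in range(n - x1 + 1):
            yield x1, x2, n - x1 - x2


def multinomial_term(x, probs=PROBS):
    """Probability of the configuration x under the multinomial."""
    coef = factorial(sum(x))
    for xi in x:
        coef //= factorial(xi)
    term = float(coef)
    for xi, pi in zip(x, probs):
        term *= pi**xi
    return term


def chi_square(x, probs=PROBS):
    """Pearson chi-square of the configuration against the expected counts."""
    n = sum(x)
    return sum((xi - n * pi) ** 2 / (n * pi) for xi, pi in zip(x, probs))


def size(n, level):
    """Total probability of configurations with P(chi-square) below level."""
    critical = -2.0 * log(level)  # 2 d.f. percentage point
    return sum(multinomial_term(x) for x in configurations(n)
               if chi_square(x) > critical)


if __name__ == "__main__":
    print(round(100 * size(10, 0.05), 2), round(100 * size(10, 0.70), 2))
```

For N = 10 this gives 5.02% at the 5% level and 65.27% at the 70% level, matching the corresponding entries of Table II.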

Table III shows that although the total number of configurations is equal to
½(N+1)(N+2), the numbers of configurations in the range, 0.1 to 1.0, increase
much more slowly, in fact linearly as N. A neighbourhood of a point, 0.8 for
example, may be considered. Then a single configuration will be represented by
a frequency proportional approximately to
(0.2×0.3×0.5)^{−1/2} (2πN)^{−1} exp(−½χ²_{0.8}), where χ²_{0.8} is the χ²
corresponding to the 0.8 probability level. The density of configurations in
the neighbourhood of the 0.8 point will be the inverse of this and so the
number of configurations assigned to any class in this neighbourhood will
increase linearly with N.

TABLE III. THE NUMBER OF CONFIGURATIONS ASSIGNABLE TO THE
PROBABILITY CLASSES BY THE χ² TEST IN THE
MULTINOMIALS, (0.5+0.3+0.2)^N

[Table body, with columns headed by the probability class, not reproduced.]

Values calculated from the formula, 2πN(0.03)^{1/2} log_e(π_2/π_1), are given
in the columns under C; π_2 and π_1 are the upper and lower boundaries of the
probability classes; numerical values of the multiplier of N are given in
the last line of the table.

Owing to the special relation between χ² for two degrees of freedom and the
corresponding P(χ²), namely

    P(\chi^2) = \tfrac{1}{2}\int_{\chi^2}^{\infty} \exp(-\tfrac{1}{2}\chi^2)\, d\chi^2 = \exp(-\tfrac{1}{2}\chi^2) ,    (7.2)

a simple analytic form can be derived for ν, the number of configurations in a
given probability class, (π_1, π_2):

    \nu(\pi, N) = (0.03)^{1/2}\, 2\pi N \int_{\pi_1}^{\pi_2} \frac{dP(\chi^2)}{\exp(-\tfrac{1}{2}\chi^2)}
                = (0.03)^{1/2}\, 2\pi N \int_{\pi_1}^{\pi_2} P^{-1}\, dP
                = cN \log_e(\pi_2/\pi_1) ,    (7.3)

where c is a constant, 2π(0.03)^{1/2} = 1.08828.

This rule is quite effective, as can be seen by comparing observed and
calculated numbers of the configurations assigned to the probability classes;
the numerical values of c log_e(π_2/π_1) are shown at the foot of the columns of
Table III and the calculated numbers, cN log_e(π_2/π_1), are given in the main
body of the table.
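
The constant c and the calculated numbers of (7.3) are simple to evaluate. The sketch below assumes the cell probabilities 0.2, 0.3 and 0.5; the function name and the example class (0.1, 0.3) with N = 20 are illustrative only.

```python
from math import log, pi, sqrt

# c = 2 * pi * (0.2 * 0.3 * 0.5)^(1/2) = 2 * pi * (0.03)^(1/2)
C = 2 * pi * sqrt(0.2 * 0.3 * 0.5)


def expected_configurations(lower, upper, n):
    """Calculated number c * N * log_e(upper / lower) for a probability class."""
    return C * n * log(upper / lower)


if __name__ == "__main__":
    print(round(C, 5))
    print(round(expected_configurations(0.1, 0.3, 20), 2))
```

Because the rule depends only on the ratio of the class boundaries, the calculated numbers for adjoining classes add up to the number for the combined class.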

What is reasonable agreement between the calculated and theoretical values
in Table II? Differences up to at least the value of the frequency of a single
configuration (x_1, x_2, x_3) in the neighbourhood of the probability level being
considered must be expected, that is, differences of the order of

    (0.03)^{-1/2} (2\pi N)^{-1} \exp(-\tfrac{1}{2}\chi_P^2) = 0.9189\,P/N .    (7.4)

0.9189P, expressed as a percentage, is tabulated at the foot of Table II. It
will be found that comparatively few of the differences of the calculated from
the theoretical in Table II are greater than 0.9189 P/N and very few greater
than twice this number. This section illustrates the unexpectedly accurate
assignment by P(χ²) of the different configurations (x_1, x_2, x_3) to the
probability classes. This conclusion is consistent with those of Cochran [1]
and [2].

Results, rather less favourable than these, would be obtained in the sym- 
metrical multinomials since the number of different partitions of a number 
N will be smaller than the number of configurations, so that there will be fewer 
points of increase for the distribution of the discrete χ². But even in these
multinomials, the assignments to the probability class were found to give 
reasonable agreement by Lancaster [8]. 


8. THE CONDITIONS FOR A CRITERION OF SIGNIFICANCE 


It is possible to lay down some conditions which the probability, w, assign- 
able to any set of experimental results should possess. 


(i) w = 0 should be impossible. This is necessary if −2 log_e w is to have finite
expectation and variance. It is desirable also that w = 1 should be impossible
in view of the possibility that −2 log_e (1−w) may be required.

(ii) In the null case, w should be distributed rectangularly in the range, 0
to 1. In a single discrete population, this will not be possible but for a class of
distributions it may be true for all a and b such that 0 < a < b < 1, that the
expectation of w falling in the interval, (a, b), is approximately proportional
to (b−a).

(iii) w should be easy to compute. 

(iv) The same judgment must always be made on the same data. w should 
be uniquely determined by the data. 

(v) If an hypothesis is rejected after an experiment at a certain level, 
further results unfavourable to the hypothesis should not have been able to 
reverse this judgment. 

(vi) The statistical judgment should not be dominated by the auxiliary 
random experiment. 

(vii) There should not be a great discrepancy between the size and (nom- 
inal) significance levels. 
Now it is easy to see that the exact probability violates (i), (ii), (vii) and
sometimes (iii). The χ corrected for continuity violates (i), (ii) and (vii). The
auxiliary random experiment violates (iv), (v) and (vi) and sometimes (iii),
(vi) being violated especially in statistical control of counting with small
numbers because the random sampling will have to be used frequently, as in
the examples above.

In conclusion, the P(χ) test seems to have many desirable properties as a
test criterion and might well be used in the commonly occurring situations.
MULTIPLE REGRESSION ANALYSIS OF A POISSON PROCESS*
A multiple regression model based on the Poisson density is discussed.
Point estimation by maximum likelihood results in estimating
equations which can be solved by iteration based on scoring or on a 
successively revised weighted least squares estimator. The first round 
of each iterative process is shown to be BAN, provided that the or- 
dinary least squares estimator is used as the initial condition. A mini- 
mum chi-squared estimator, which can be computed directly from 
sample observations, is derived and shown to be BAN. A consistent 
estimator of the variance-covariance matrix for each estimation pro- 
cedure is given. 


1. INTRODUCTION 


DEMAND analysis and reliability control provide two important areas of
application for multiple regression analysis of a Poisson process. In demand
analysis the dependent variable in the demand relation, quantity demanded, is
known to be non-negative so that the usual normality assumption cannot be
strictly correct. It has been suggested by Hildreth [8] that assuming that
quantity demanded has a Poisson density may be more realistic in
some applications than the customary assumption of normality. If quantity 
demanded has a Poisson density, the normal approximation to the Poisson 
may be satisfactory, provided that the mean quantity demanded is large; 
however, if zero values of the quantity demanded occur frequently, as they do 
in many cross section applications, the effects of departure from normality are 
likely to be serious. For such studies multi-variate Poisson regression analysis 
provides an interesting alternative technique to the adaptation of probit an- 
alysis given by Tobin [14]. In cross section studies the dependent variable is 
taken to be the number of purchases of a particular commodity by individual 
families during the period of observation; family income, family size, educa- 
tional level, and so on, are taken as independent variables. In time series ap- 
plications the dependent variable is taken to be the number of sales of a partic- 
ular commodity; consumer income, the price of the commodity, and prices of 
other commodities are taken as independent variables. 

In reliability control multiple regression methods provide a means of analyz- 
ing failure mechanisms which are more complex than the “random” mechanism 
underlying the simple Poisson process. Such failure mechanisms retain much 
of the analytical simplicity of the simple Poisson process, including the prop- 
erty that rates of failure do not depend on the age of the equipment. First, the 
failure mechanism may be governed by more than one operating regime. The 
number of failures during each period of observation is taken as the dependent
variable in the regression. Times elapsed in each of the various operating regimes
during the period of observation are taken as independent variables. The
regression coefficients represent rates of failure in the corresponding operating 
regime. Secondly, if a proportion of the elements of the population is defective 
to begin with (or defective under “burn-in” stress) this proportion may be esti- 
mated by taking as one of the independent variables a variable with value 
identically equal to the number of elements in the population (including re- 
placements of failed elements). The corresponding regression coefficient is 
interpreted as the proportion initially defective. Thirdly, if the stress of pas- 
sage from one regime (“on” to “off” and vice versa) is an important source of 
failures as suggested by Stoller [13], the number of times the equipment is 
turned on during each period of observation may be taken as an independent 
variable. The corresponding regression coefficient is interpreted as the rate of 
failure under a regime of “on-off” stress. 

To make the following discussion concrete the terminology of reliability 
control or life testing [2, 5, 9] will be employed throughout. Occurrences are 
referred to as failures and the independent variables as elapsed times. In the 
next section known results on simple regression analysis of a Poisson process 
are reviewed. Maximum likelihood estimates for parameters of a multi-variate 
Poisson process are derived in the following section. Two alternative computa- 
tional methods and certain large sample properties of the maximum likelihood 
and related estimators are discussed. A numerical example, illustrating the 
alternative methods of estimation, concludes the paper. 


2. SIMPLE REGRESSION 
In simple regression analysis of a Poisson process it is assumed that time to
failure $u$ has the exponential density:

$$f(u) = \lambda e^{-\lambda u}, \tag{2.1}$$

where $1/\lambda = E(u)$ is the mean time to failure and $\lambda$ is the failure rate. For any
time interval, $T$, the increment of failures, $n$, has the Poisson density:

$$g(n) = e^{-\lambda T}(\lambda T)^{n}/n!. \tag{2.2}$$

Given a set of observations $n_1, n_2, \ldots, n_m$ (failures or occurrences) and
$T_1, T_2, \ldots, T_m$ (elapsed times), the likelihood function for the observed
sample is:

$$L\{n_1, n_2, \ldots, n_m \mid \lambda; T_1, T_2, \ldots, T_m\}
= \prod_{i=1}^{m} \frac{e^{-\lambda T_i}(\lambda T_i)^{n_i}}{(n_i)!}
= \exp\Bigl[-\lambda \sum_{i=1}^{m} T_i\Bigr] \prod_{i=1}^{m} \frac{(\lambda T_i)^{n_i}}{(n_i)!}, \tag{2.3}$$

provided that successive observations are independent. The maximum likelihood
estimator for $\lambda$ is as follows:

$$l = \frac{\sum_{i=1}^{m} n_i}{\sum_{i=1}^{m} T_i}, \tag{2.4}$$

where $l$ is the maximum likelihood estimator of $\lambda$. The sum of failures and sum
of elapsed times over all observations provide sufficient information for maximum
likelihood estimation. The maximum likelihood estimator is unbiased
and efficient [2], hence a best estimator of $\lambda$ [4]. The least squares estimator:

$$\ell = \frac{\sum_{i=1}^{m} n_i T_i}{\sum_{i=1}^{m} T_i^{2}} \tag{2.5}$$

is unbiased but relatively inefficient [1].
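The two estimators of the failure rate can be compared on a small data set. The following is only a sketch: the counts and elapsed times are made up for illustration, the function names are hypothetical, and the least squares form assumed is the regression of $n_i$ on $T_i$ through the origin.

```python
# Sketch: maximum likelihood estimator (2.4) versus least squares through
# the origin (2.5). The data below are invented for illustration only.

def ml_rate(failures, times):
    """Maximum likelihood estimate: total failures divided by total elapsed time."""
    return sum(failures) / sum(times)

def ls_rate(failures, times):
    """Least squares estimate for the regression of n_i on T_i through the origin."""
    cross = sum(n * t for n, t in zip(failures, times))
    squares = sum(t * t for t in times)
    return cross / squares

failures = [3, 0, 5, 2]             # hypothetical n_i
times = [10.0, 4.0, 20.0, 6.0]      # hypothetical T_i

print("maximum likelihood:", ml_rate(failures, times))
print("least squares:     ", ls_rate(failures, times))
```

The maximum likelihood estimate uses only the two totals, which is the sufficiency property noted in the text; the least squares estimate gives more weight to the longer intervals.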
Confidence intervals for $\lambda$ and critical regions for tests of hypotheses about $\lambda$
may be obtained from the density of

$$n = \sum_{i=1}^{m} n_i.$$

With

$$T = \sum_{i=1}^{m} T_i,$$

this density is Poisson:

$$g(n \mid \lambda) = e^{-\lambda T}\,\frac{(\lambda T)^{n}}{n!}, \tag{2.6}$$

so that the usual tables of the Poisson density may be used for inference about
the parameter $\lambda$.
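Where tables are not at hand, the same inference can be carried out numerically. The sketch below is one possible reading of "inference from the Poisson density": equal-tail confidence limits for the failure rate obtained by solving the two tail conditions by bisection. The function names, the bracketing rule and the example figures (10 failures in 40 hours) are assumptions made for illustration.

```python
import math

def poisson_cdf(n, mean):
    """P(N <= n) for a Poisson variate with the given mean."""
    term = math.exp(-mean)          # P(N = 0)
    total = term
    for x in range(1, n + 1):
        term *= mean / x            # P(N = x) obtained from P(N = x - 1)
        total += term
    return total

def bisect_root(f, lo, hi, steps=200):
    """Root of f on [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rate_limits(n, T, alpha=0.05):
    """Equal-tail confidence limits for the rate from n failures in total time T."""
    top = n + 10.0 * math.sqrt(n + 1.0) + 20.0   # generous bracket for the mean
    if n == 0:
        lower_mean = 0.0
    else:
        # lower limit: mean at which P(N >= n) equals alpha / 2
        lower_mean = bisect_root(
            lambda m: 1.0 - poisson_cdf(n - 1, m) - alpha / 2, 0.0, top)
    # upper limit: mean at which P(N <= n) equals alpha / 2
    upper_mean = bisect_root(lambda m: poisson_cdf(n, m) - alpha / 2, 0.0, top)
    return lower_mean / T, upper_mean / T

low, high = rate_limits(10, 40.0)   # hypothetical: 10 failures in 40 hours
print(low, high)
```

Only the totals $n$ and $T$ enter the computation, in line with (2.6).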

Because of the additive properties of the Poisson process, the same results
apply to a set of observations $(n_{ij})$ and $(T_{ij})$ where $n_{ij}$, $T_{ij}$ are the number of
failures observed and times elapsed in the $i$th cycle of operations and the $j$th
environment. Aggregation of observations over cycles involves no loss of
information.


3. MULTIPLE REGRESSION 


If the experiment generates observations $(n_i)$, $(T_{ij})$ where $n_i$ is the number of
failures observed in the $i$th cycle of operations and $T_{ij}$ is the time elapsed in the
$j$th regime during the $i$th cycle, $n_i$ is the sum of $k$ independent Poisson variates
and has a Poisson density with mean equal to the sum of the expected values
of the $k$ independent Poisson variates:

$$h(n_i \mid \lambda_1, \lambda_2, \ldots, \lambda_k)
= \frac{\bigl(\sum_{j=1}^{k} \lambda_j T_{ij}\bigr)^{n_i}}{n_i!}
\exp\Bigl[-\sum_{j=1}^{k} \lambda_j T_{ij}\Bigr]. \tag{3.1}$$

In this section maximum likelihood estimators for the vector $\lambda = (\lambda_j)$ are
studied. Large sample properties of the resulting estimator are discussed in the
following section. We observe immediately that the requirement $m \geq k$ is
necessary for identifiability of the parameters $\lambda_j$.

Given a set of independent observations $(n_i)$ and $(T_{ij})$ as above, the likelihood
function for the observed sample is given by:


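$$L\{(n_i) \mid (\lambda_j); (T_{ij})\}
= \prod_{i=1}^{m} \frac{\bigl(\sum_{j=1}^{k} \lambda_j T_{ij}\bigr)^{n_i}}{n_i!}
\exp\Bigl[-\sum_{j=1}^{k} \lambda_j T_{ij}\Bigr]
= \exp\Bigl[-\sum_{j=1}^{k} \lambda_j \sum_{i=1}^{m} T_{ij}\Bigr]
\prod_{i=1}^{m} \frac{\bigl(\sum_{j=1}^{k} \lambda_j T_{ij}\bigr)^{n_i}}{n_i!}. \tag{3.2}$$

The maximum likelihood estimator of the vector $\lambda$ is obtained as follows:

$$\mathcal{L} = \ln L = -\sum_{j=1}^{k} \lambda_j \sum_{i=1}^{m} T_{ij}
+ \sum_{i=1}^{m} n_i \ln\Bigl(\sum_{j=1}^{k} \lambda_j T_{ij}\Bigr)
- \sum_{i=1}^{m} \ln (n_i!), \tag{3.3}$$

so that:

$$\frac{\partial \mathcal{L}}{\partial \lambda_j}
= -\sum_{i=1}^{m} T_{ij} + \sum_{i=1}^{m} \Bigl[\frac{n_i T_{ij}}{\sum_{h=1}^{k} l_h T_{ih}}\Bigr] = 0,
\qquad (j = 1, \ldots, k), \tag{3.4}$$

where $l_j$ is the maximum likelihood estimate of $\lambda_j$. The solution to this system
of equations may be calculated by the method of scoring [11] (also called the
Newton-Raphson iterative process). The $j$th efficient score is given by the
equations (3.4). The information matrix is given by: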


$$I^{0}_{jj'} = -\frac{\partial^{2} \mathcal{L}}{\partial \lambda_j\, \partial \lambda_{j'}}\Big|_{\lambda = l^{0}}
= \sum_{i=1}^{m} \frac{n_i T_{ij} T_{ij'}}{\bigl(\sum_{h=1}^{k} l^{0}_h T_{ih}\bigr)^{2}}, \tag{3.5}$$

where $l^{0}_j$ is some initial estimate of $\lambda_j$. In matrix notation, the correction to the
vector $l$ of estimates at each iteration is given by:

$$\begin{aligned}
I^{0}\,\Delta l &= S^{0}, \\
I^{0}(l^{1} - l^{0}) &= S^{0}, \\
l^{1} &= l^{0} + (I^{0})^{-1} S^{0},
\end{aligned} \tag{3.6}$$

where $I^{0}$ is the information matrix, $S^{0}$ is the vector of scores, $l^{0}$ is the vector of
initial estimates, and $l^{1}$ is the vector of corrected estimates. A second
approximation is given by:

$$l^{2} = l^{1} + (I^{1})^{-1} S^{1}, \tag{3.7}$$

or in general:

$$l^{t} = l^{t-1} + (I^{t-1})^{-1} S^{t-1}. \tag{3.8}$$

If this process attains a fixed point, the vector of scores vanishes and the
maximum likelihood equations (3.4) are satisfied. For the process to converge
it is sufficient that $I^{t}$ be positive definite for all $t$. The condition that $I$, the true
information matrix, be positive definite is sufficient for the identifiability of the
$\lambda$'s.
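The iteration (3.6) to (3.8) can be sketched in a few lines. This is an illustration under stated assumptions, not the author's program: the information matrix is taken to be minus the matrix of second derivatives of the log likelihood evaluated at the current estimate, the helper names (`solve`, `means`, `scores`, `information`, `scoring`) are hypothetical, and the data are invented so that the counts equal their expected values at rates (0.2, 0.1).

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small systems)."""
    size = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(size):
        p = max(range(c, size), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, size):
            f = M[r][c] / M[c][c]
            for q in range(c, size + 1):
                M[r][q] -= f * M[c][q]
    x = [0.0] * size
    for r in range(size - 1, -1, -1):
        tail = sum(M[r][q] * x[q] for q in range(r + 1, size))
        x[r] = (M[r][size] - tail) / M[r][r]
    return x

def means(l, T):
    """Expected counts sum_j l_j T_ij for each cycle i."""
    return [sum(lj * tij for lj, tij in zip(l, row)) for row in T]

def scores(l, n, T):
    """Score vector as in (3.4): S_j = -sum_i T_ij + sum_i n_i T_ij / mean_i."""
    mu = means(l, T)
    return [sum(-row[j] + ni * row[j] / m for row, ni, m in zip(T, n, mu))
            for j in range(len(l))]

def information(l, n, T):
    """Minus the second-derivative matrix of the log likelihood at l."""
    mu = means(l, T)
    k = len(l)
    return [[sum(ni * row[a] * row[b] / m ** 2 for row, ni, m in zip(T, n, mu))
             for b in range(k)] for a in range(k)]

def scoring(l0, n, T, rounds=20, tol=1e-10):
    """Repeat l <- l + I^{-1} S until the correction is negligible."""
    l = list(l0)
    for _ in range(rounds):
        step = solve(information(l, n, T), scores(l, n, T))
        l = [a + d for a, d in zip(l, step)]
        if max(abs(d) for d in step) < tol:
            break
    return l

# Hypothetical data: two regimes, four cycles, counts equal to their means
# at rates (0.2, 0.1), so the likelihood equations are solved by those rates.
T = [[10.0, 20.0], [20.0, 10.0], [5.0, 30.0], [30.0, 10.0]]
n = [4, 5, 4, 7]
print(scoring([0.15, 0.15], n, T))
```

At the fixed point the score vector vanishes, which is the condition stated in the text. The sketch does not guard against a step that makes some expected count non-positive; a practical version would need to.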

An alternative method for estimating $\lambda$ is the method of least squares. Where
$T$ is the matrix $(T_{ij})$, $\ell$ is the vector of least squares estimates and $n$ is the vector
$(n_i)$, the estimators are given by Aitken [1]:

$$\ell = (T'T)^{-1} T'n. \tag{3.9}$$

The efficiency of estimation by least squares may be improved by re-estimating
$\lambda$, using the fact that:

$$E(n - T\lambda)(n - T\lambda)' = \Omega, \tag{3.10}$$

where $\Omega$ is the variance-covariance matrix of the variables $n_i$. If these variables
are independent, $\Omega$ is given by:

$$\Omega = \operatorname{diag}\Bigl(\sum_{j=1}^{k} \lambda_j T_{1j},\; \ldots,\; \sum_{j=1}^{k} \lambda_j T_{mj}\Bigr). \tag{3.11}$$

Estimating $\Omega$ by $H^{0}$, where each $\lambda_j$ is replaced by the corresponding least squares
estimator, revised estimates are given by:

$$\ell^{1} = (T'(H^{0})^{-1}T)^{-1} T'(H^{0})^{-1} n. \tag{3.12}$$

This process may be reiterated yielding a second revised estimate:

$$\ell^{2} = (T'(H^{1})^{-1}T)^{-1} T'(H^{1})^{-1} n, \tag{3.13}$$

and so on; in general:

$$\ell^{t} = (T'(H^{t-1})^{-1}T)^{-1} T'(H^{t-1})^{-1} n, \tag{3.14}$$

which converges provided that $H^{t}$ and $(T'(H^{t})^{-1}T)^{-1}$ are positive definite for
all $t$. If this process attains a fixed point, the maximum likelihood equations
(3.4) are satisfied since:

$$T'H^{-1}T\,l = T'H^{-1}n, \tag{3.15}$$

where $H$ and $l$ are revised least squares estimators of $\Omega$ and $\lambda$, may be written:

$$\sum_{i=1}^{m} T_{ij} = \sum_{i=1}^{m} \frac{n_i T_{ij}}{\sum_{h=1}^{k} l_h T_{ih}},
\qquad (j = 1, \ldots, k), \tag{3.16}$$

which are the maximum likelihood equations. For normal, heteroscedastic regression
Fisher [7] has shown that not only do revised least squares and scoring
methods converge to a solution to the maximum likelihood equations, but the
iterative processes themselves are identical.
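The successively revised least squares process can be sketched as follows, starting from ordinary least squares and re-weighting by the reciprocal of the fitted means at each round. The helper names and the data are hypothetical, and the sketch assumes every fitted mean stays positive along the way.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (small systems)."""
    size = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(size):
        p = max(range(c, size), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, size):
            f = M[r][c] / M[c][c]
            for q in range(c, size + 1):
                M[r][q] -= f * M[c][q]
    x = [0.0] * size
    for r in range(size - 1, -1, -1):
        tail = sum(M[r][q] * x[q] for q in range(r + 1, size))
        x[r] = (M[r][size] - tail) / M[r][r]
    return x

def weighted_ls(n, T, w):
    """Return (T' W T)^{-1} T' W n for a diagonal weight matrix W = diag(w)."""
    k = len(T[0])
    A = [[sum(wi * row[a] * row[b] for row, wi in zip(T, w)) for b in range(k)]
         for a in range(k)]
    c = [sum(wi * row[a] * ni for row, ni, wi in zip(T, n, w)) for a in range(k)]
    return solve(A, c)

def revised_ls(n, T, rounds=200, tol=1e-10):
    """Ordinary least squares, then repeated re-weighting by 1 / fitted mean."""
    l = weighted_ls(n, T, [1.0] * len(n))            # unit weights: ordinary least squares
    for _ in range(rounds):
        h = [sum(lj * tij for lj, tij in zip(l, row)) for row in T]   # diagonal of H
        new = weighted_ls(n, T, [1.0 / hi for hi in h])
        change = max(abs(a - b) for a, b in zip(new, l))
        l = new
        if change < tol:
            break
    return l

# Hypothetical data: two regimes, four cycles.
T = [[10.0, 20.0], [20.0, 10.0], [5.0, 30.0], [30.0, 10.0]]
n = [3, 6, 4, 7]
l = revised_ls(n, T)
h = [sum(lj * tij for lj, tij in zip(l, row)) for row in T]
for j in range(2):
    # at a fixed point the two sides of the likelihood equations agree
    print(sum(row[j] for row in T), sum(ni * row[j] / hi for row, ni, hi in zip(T, n, h)))
```

The printed pairs are the left and right sides of the likelihood equations for each regime; their agreement at the fixed point is the property claimed in the text.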
4. SAMPLING THEORY


In the case of simple regression the maximum likelihood estimator is known 
to be efficient and unbiased in small samples; furthermore, this estimator is 
easy to compute. For multiple regression the situation is considerably less 
favorable; only large sample properties of the estimators are easy to obtain.* 
As usual the maximum likelihood estimators are best asymptotically normal
(BAN) in the sense of Neyman [10] with variance-covariance matrix equal to
the inverse of the information matrix (3.5). The variance-covariance matrix
may be estimated consistently by replacing each $\lambda_j$ in (3.5) by any consistent
estimator of $\lambda_j$. For large samples, the maximum likelihood estimators (3.4)
and the corresponding variance-covariance matrix (3.5) may be used to obtain
critical regions and confidence intervals for purposes of inference.

The principal disadvantage of the maximum likelihood estimator is the
burden of computation; either of the iterative processes examined in the previous
section is quite tedious and rapid convergence is by no means assured.
It would be desirable to reduce the computational burden by obtaining alternative
BAN estimators which are easier to compute. In this section three such
estimators will be discussed. It will be found that the first-round revised least
squares and scoring estimators $\ell^{1}$ and $l^{1}$ are BAN, provided that the initial
value of each iterative process is taken to be the ordinary least squares estimator,
which is known to be consistent. A third estimator, which may be computed
directly from the sample observations, is generated by the $\chi^{2}$ minimum
method [10, 6]. This estimator is also BAN.

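To show that $l^{1}$, the first-round scoring estimator of $\lambda$, is BAN, we expand the
log-likelihood function $\mathcal{L}$ in the neighborhood of $\ell$, the least squares estimator,
obtaining:

$$\begin{aligned}
\mathcal{L}(\lambda) &= \mathcal{L}(\ell) + (\lambda - \ell)'\,\frac{\partial \mathcal{L}(\ell)}{\partial \lambda}
+ \frac{1}{2}(\lambda - \ell)'\,\frac{\partial^{2} \mathcal{L}(\ell)}{\partial \lambda^{2}}\,(\lambda - \ell) + R(\lambda) \\
&= \mathcal{L}(\ell) + (S^{0})'(\lambda - \ell) - \frac{1}{2}(\lambda - \ell)'\,I^{0}\,(\lambda - \ell) + R(\lambda),
\end{aligned} \tag{4.1}$$

where $R(\lambda)$ is a remainder, $\partial \mathcal{L}(\ell)/\partial \lambda$ is a vector of first partial derivatives of $\mathcal{L}$,
and $\partial^{2} \mathcal{L}(\ell)/\partial \lambda^{2}$ is a matrix of second partial derivatives of $\mathcal{L}$, all evaluated at
$\lambda = \ell$. Since $\ell$ is consistent, $R(\lambda)$ goes to zero as sample size increases; dropping
the remainder term, $\mathcal{L}(\lambda)$ is maximized by setting the vector of first partial
derivatives equal to zero:

$$\frac{\partial \mathcal{L}(\lambda)}{\partial \lambda} = S^{0} - I^{0}(\lambda - \ell) = 0, \tag{4.2}$$

where $\lambda$ is set equal to $l^{1}$, the first-round scoring estimator, since:

$$l^{1} = \ell + (I^{0})^{-1} S^{0}. \tag{4.3}$$

The estimator $l^{1}$ is easily seen to be BAN [6, p. 1050].

* Throughout the following discussion, consider $T = [T_{ij}]$ in the form $T = [\rho\tau_{ij}]$ where $[\tau_{ij}]$ remains fixed and
the scalar $\rho$ increases as sample size increases.

To generate the two remaining BAN estimators of $\lambda$, consider $\chi^{2}$ given by the
expression: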


$$\chi^{2} = (n - T\lambda)'\,\Omega^{-1}(n - T\lambda), \tag{4.4}$$

where $\Omega$ is the variance-covariance matrix (3.11). Provided that $\Omega$ is known,
minimization of $\chi^{2}$ yields:

$$\frac{\partial \chi^{2}}{\partial \lambda} = 2T'\Omega^{-1}T\,l^{*} - 2T'\Omega^{-1}n = 0, \tag{4.5}$$

so that:

$$l^{*} = (T'\Omega^{-1}T)^{-1}\,T'\Omega^{-1}n, \tag{4.6}$$

where the variance-covariance matrix for $l^{*}$ is:

$$\Bigl[\frac{1}{2}\,\frac{\partial^{2} \chi^{2}}{\partial \lambda^{2}}\Bigr]^{-1} = (T'\Omega^{-1}T)^{-1}. \tag{4.7}$$

The estimator $l^{*}$ is best, linear, unbiased for small samples [1], as well as BAN
[6]. Although the variance-covariance matrix $\Omega$ is unknown in most applications,
the estimator obtained by replacing $\Omega$ in (4.6) by any consistent estimator
retains the BAN property, subject to certain regularity conditions [6,
p. 1048]. Of course, the small sample properties of $l^{*}$ are no longer assured.
Two consistent estimators of $\Omega$ will be considered. First, replace $\Omega$ by $H^{0}$ as in
(3.12); the resulting estimator is the first-round revised least squares estimator
$\ell^{1}$:

$$\ell^{1} = (T'(H^{0})^{-1}T)^{-1}\,T'(H^{0})^{-1}n, \tag{4.8}$$

which is a modified $\chi^{2}$ minimum estimator and hence BAN [6, p. 1048]. A
second estimator, closely related to the first, is obtained by replacing $\Omega$ by $N$,
a consistent estimator of $\Omega$, where:

$$N = \operatorname{diag}(n_1, n_2, \ldots, n_m). \tag{4.9}$$

The resulting modified $\chi^{2}$ minimum estimator is:

$$\tilde{l} = (T'N^{-1}T)^{-1}\,T'N^{-1}n, \tag{4.10}$$

which is BAN; the computations may be shortened by observing that:

$$T'N^{-1}n = \begin{bmatrix} \sum_{i=1}^{m} T_{i1} \\ \vdots \\ \sum_{i=1}^{m} T_{ik} \end{bmatrix}. \tag{4.11}$$

The estimator $\tilde{l}$, a kind of weighted least squares estimator with known weights,
may be computed directly from the sample observations without resorting to
successive modification of the ordinary least squares estimator as in scoring or
revised least squares. For samples of small or even moderate size, this estimator
may be unsuitable because of the occurrence of zero values of the dependent
variable $n_i$.
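For two regimes the direct estimator needs only a 2 by 2 inverse. The sketch below assumes the weights are the reciprocals of the observed counts (so that the weight matrix is diagonal with the counts on the diagonal), uses the shortcut that $T'N^{-1}n$ is the vector of column totals of $T$, and refuses zero counts. The function name and the data are hypothetical.

```python
def min_chi2_two_regimes(n, T):
    """Weighted least squares with weights 1 / n_i, for two operating regimes.

    n is a list of counts and T a list of (T_i1, T_i2) pairs. A zero count
    makes diag(n) singular, which is the difficulty noted in the text.
    """
    if any(ni == 0 for ni in n):
        raise ValueError("zero count: the weight matrix cannot be inverted")
    a = sum(t1 * t1 / ni for (t1, t2), ni in zip(T, n))   # (T'N^{-1}T)_11
    b = sum(t1 * t2 / ni for (t1, t2), ni in zip(T, n))   # (T'N^{-1}T)_12
    d = sum(t2 * t2 / ni for (t1, t2), ni in zip(T, n))   # (T'N^{-1}T)_22
    c1 = sum(t1 for t1, _ in T)                           # column total, regime one
    c2 = sum(t2 for _, t2 in T)                           # column total, regime two
    det = a * d - b * b
    return (d * c1 - b * c2) / det, (a * c2 - b * c1) / det

# Hypothetical data: counts equal to their means at rates (0.2, 0.1).
T = [(10.0, 20.0), (20.0, 10.0), (5.0, 30.0), (30.0, 10.0)]
n = [4, 5, 4, 7]
print(min_chi2_two_regimes(n, T))
```

No iteration and no preliminary least squares fit are required, which is the computational advantage the text points to.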

For any BAN estimator, the variance-covariance matrix of the estimators
for large samples is estimated consistently by the inverse of the information
matrix (3.5), evaluated at any consistent estimator of $\lambda$, for example, the least
squares estimator $\ell$. An alternative consistent estimate of the variance-covariance
matrix may be obtained by replacing $\Omega$ in (4.7) by a consistent estimator,
say $H^{0}$ or $N$ from (3.12) or (4.9), respectively. The estimated variance-covariance
matrix may be employed in the usual way to obtain tests which are
asymptotically equivalent to likelihood ratio tests [10, p. 259 et seq.]; confidence
intervals for individual estimators or any combination of estimators
may be estimated from the joint normal distribution with expectation equal to
the BAN estimator and variance-covariance matrix given by the corresponding
variance-covariance estimator.


5. NUMERICAL EXAMPLE 


In this section a numerical example is given in which first-round scoring and 
revised least squares estimates are computed, using ordinary least squares 
estimates as initial conditions for the appropriate iterative process. The data 
employed consist of failures of a complex piece of electronic equipment under 
cycles of operation (weeks) in which each cycle is divided into two operating 
regimes. Where $n_i$ is the number of failures in the $i$th cycle, $T_{i1}$ and $T_{i2}$ are
times spent in regime one and regime two during the $i$th cycle. The failure model
is:

$$n_i = \lambda_1 T_{i1} + \lambda_2 T_{i2}, \qquad (i = 1, \ldots, m). \tag{5.1}$$
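The original observations are presented in Table 1.


TABLE 1. ORIGINAL OBSERVATIONS
(only part of the table is legible; the surviving elapsed times agree with
entries 5 to 8 of Table 2)

Cycle $i$   |   5   |   6   |   7   |   8
$T_{i1}$    | 125.9 | 116.3 | 131.7 |  85.0
$T_{i2}$    |  97.6 |  53.6 |  56.6 |  87.3


The least squares estimates of $\lambda_1$ and $\lambda_2$ are:

$$\ell_1 = .1725; \qquad \ell_2 = .0657.$$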
For these estimates, the vector of scores (3.4) is given by:

$$\begin{aligned}
S_1 &= -836.9 + 863.3 = 26.4, \\
S_2 &= -435.6 + 454.9 = 19.3.
\end{aligned}$$

The information matrix (3.5) and its inverse are given by:

$$I^{0} = \begin{bmatrix} \text{[illegible]} & 2126 \\ 2126 & 1332 \end{bmatrix},
\qquad
(I^{0})^{-1} = \begin{bmatrix} .001255 & -.002003 \\ -.002003 & .003948 \end{bmatrix}, \tag{5.4}$$

so that the first-round maximum likelihood estimate given by (3.6) is:

$$\begin{aligned}
l^{1}_1 &= .1725 + .0331 - .0387 = .1669, \\
l^{1}_2 &= .0657 - .0529 + .0762 = .0890.
\end{aligned} \tag{5.5}$$
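The arithmetic of this first-round step can be retraced from the figures as read from the text above. This is only a check of the matrix product, with variable names chosen for illustration; agreement is to within the rounding of the printed four-figure values.

```python
# Least squares estimates, scores and inverse information matrix as read
# from the text; the variable names are illustrative only.
least_squares = (0.1725, 0.0657)
score = (26.4, 19.3)
inverse_information = ((0.001255, -0.002003),
                       (-0.002003, 0.003948))

# l1 = l0 + (I0)^{-1} S0, written out component by component.
first_round = tuple(
    l0 + sum(inverse_information[j][h] * score[h] for h in range(2))
    for j, l0 in enumerate(least_squares)
)
print(first_round)
```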


To compute the revised least squares estimates, using the ordinary least
squares estimates as initial conditions for the iterative process (3.14), first
calculate the diagonal elements of the matrix $H^{0}$, namely $h_i = \ell_1 T_{i1} + \ell_2 T_{i2}$ for
$i = 1, \ldots, m$; these elements are given in Table 2.


TABLE 2. DIAGONAL ELEMENTS OF $H^{0}$

$i$    |  1   |  2   |   3   |   4   |   5   |   6   |   7   |   8   |   9
$h_i$  | 7.42 | 9.95 | 13.29 | 24.98 | 28.13 | 23.57 | 26.43 | 20.40 | 19.00


Then the matrix $(T'(H^{0})^{-1}T)$ and its inverse are given by:

$$(T'(H^{0})^{-1}T) = \begin{bmatrix} 4071 & 2035 \\ 2035 & 1279 \end{bmatrix},
\qquad
(T'(H^{0})^{-1}T)^{-1} = \begin{bmatrix} .001202 & -.001913 \\ -.001913 & \text{[illegible]} \end{bmatrix}.$$

The vector $T'(H^{0})^{-1}n$ is given by:

$$T'(H^{0})^{-1}n = \begin{bmatrix} 863.3 \\ 454.9 \end{bmatrix},$$

so that the revised least squares estimates (3.12) are then:

$$\begin{aligned}
\ell^{1}_1 &= 1.0376 - .8702 = .1674, \\
\ell^{1}_2 &= -1.6514 + 1.7417 = .0893.
\end{aligned} \tag{5.8}$$


The weighted least squares estimator $\tilde{l}$ may be computed directly from the
data. The matrix $T'N^{-1}T$ and its inverse are given by:

$$(T'N^{-1}T) = \begin{bmatrix} 4086 & 2029 \\ 2029 & \text{[illegible]} \end{bmatrix},
\qquad
(T'N^{-1}T)^{-1} = \begin{bmatrix} .001133 & -.001790 \\ -.001790 & \text{[illegible]} \end{bmatrix}.$$

The vector $T'N^{-1}n$ is given by:

$$T'N^{-1}n = \begin{bmatrix} \sum_i T_{i1} \\ \sum_i T_{i2} \end{bmatrix}
= \begin{bmatrix} 836.9 \\ 435.6 \end{bmatrix},$$

so that the weighted least squares estimator $\tilde{l}$ is given by:

$$\begin{aligned}
\tilde{l}_1 &= .1697, \\
\tilde{l}_2 &= .0699.
\end{aligned} \tag{5.11}$$

For the first-round scoring estimator, the large sample variance-covariance
matrix is estimated by the inverse of the information matrix, namely $(I^{0})^{-1}$.
For the two minimum $\chi^{2}$ estimators $\ell^{1}$ and $\tilde{l}$, the large sample variance-covariance
matrix is estimated by $(T'(H^{0})^{-1}T)^{-1}$ and $(T'N^{-1}T)^{-1}$, respectively.
Although it would not be necessary to compute any of the three inverse matrices
in the course of calculating the solutions to the corresponding systems of linear
equations, it is useful to compute these matrices if the resulting estimators are
to be used for tests of hypotheses or confidence interval estimation. The amount
of computation involved in calculating the first-round scoring and revised least
squares estimators is about the same. The computation required for the
weighted least squares estimator is considerably less.
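As an illustration of the use of an estimated variance-covariance matrix, the sketch below takes the diagonal of the inverse information matrix as read from the text and forms large-sample standard errors and approximate 95 per cent intervals around the first-round scoring estimates. The multiplier 1.96 is the usual normal quantile; the variable names are illustrative, and the intervals are a normal-theory approximation, not figures given in the paper.

```python
import math

# First-round scoring estimates and the diagonal of the inverse information
# matrix, as read from the text.
estimates = (0.1669, 0.0890)
variances = (0.001255, 0.003948)

standard_errors = tuple(math.sqrt(v) for v in variances)
intervals = tuple((e - 1.96 * s, e + 1.96 * s)
                  for e, s in zip(estimates, standard_errors))

for e, s, (low, high) in zip(estimates, standard_errors, intervals):
    print(f"estimate {e:.4f}  s.e. {s:.4f}  interval ({low:.4f}, {high:.4f})")
```

The second interval extends below zero, a reminder that the normal approximation takes no account of the restriction that failure rates are non-negative.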


6. CONCLUSION 


For multiple regression analysis of a Poisson process, a maximum likelihood 
estimator of the regression parameters may be computed by two alternative 
iterative processes. However, it is possible to show that the first iteration of 
each process has the same large sample properties as the maximum likelihood 
estimator itself, provided only that the initial value for the process is a con- 
sistent estimator of the regression parameters. The requirement of consistency 
is fulfilled by the ordinary least squares estimator. A further estimator with 
the same large sample properties, based on weighted least squares regression, 
may be computed directly from the sample observations. Although the sampling 
theory discussed in the previous section is based on large sample approxima- 
tions, this theory may be adequate for the two main applications of multi- 
variate Poisson regression suggested in Section 1: cross-section studies in de- 
mand analysis and reliability control. 

An extension of the results of this paper to problems in the analysis of vari- 
ance and the analysis of covariance may be of interest. In previous literature 
there is much discussion of the transformation of data generated by a Poisson 
process to approximate normality by various transformations related to the 
square root transformation [3; 12, p. 364, and the references listed there]. 
Although transformations may be found which are quite accurate, any trans- 
formation like the square root destroys the property of additivity possessed 
by data generated from a Poisson process [12, p. 367]. Regression methods, 
like those examined here, may be employed to estimate the parameters of a 
Poisson model directly, preserving additivity; normal sampling theory may be 
used for inference in large samples.


CONFIDENCE CURVES: AN OMNIBUS TECHNIQUE FOR ESTIMATION
AND TESTING STATISTICAL HYPOTHESES¹


ALLAN BIRNBAUM
New York University


A standard practice of physical scientists is to report estimates 
(“measurements”) accompanied by their standard errors (or alterna- 
tively “average errors” or “probable errors”). Such reports are inter- 
preted flexibly as appropriate in various contexts of application. With 
the usual normality assumption, such reports may be read as repre- 
senting confidence intervals or limits at the various confidence levels, 
and this omnibus character largely accounts for the convenience and 
flexibility of such reports and interpretations. For estimators not nor- 
mally distributed, a formal analogue of such reports is provided by 
confidence curves, which are estimates of an omnibus form incorporating 
confidence intervals and limits at various levels. The definition and com- 
putation of such estimates, and their graphical representation and inter- 
pretation, are discussed and illustrated by an example. 


1. INTRODUCTION AND SUMMARY


This note describes estimates of an omnibus form, called a confidence curve,
which incorporates confidence limits and intervals at various levels and a
median-unbiased point estimate, together with critical levels of tests of hy- 
potheses of interest, and representations of the power of such tests. 

Such estimates can be represented conveniently and interpreted flexibly, 
as seems appropriate for typical purposes of informative inference, that is, pur- 
poses of representing suitably the general significance of observations as evi- 
dence relevant to the determination of values of unknown parameters. The use 
of such estimates avoids the need for adoption of a particular confidence level, 
which is overly schematic for such purposes. Much of current practice in ap- 
plied statistics involves flexible use and interpretation of confidence intervals 
and confidence coefficients; confidence curve estimates provide an explicit 
unified form representing such practice. Estimates of omnibus form have been 
proposed by Tukey [6] and Cox [4]; the confidence curve form, illustrated 
below, has proved convenient in the development of some new theoretical and 
computational methods of estimation reported elsewhere [1]. 

The use of tests, especially without consideration of power, is also overly 
schematic for many purposes of informative inference, and there is increasing 
awareness that many problems customarily formulated in terms of testing can 
be treated more appropriately as problems of estimation. The recent paper by 
Natrella [5] describes this trend and some reasons for it, and illustrates how the 
relation between confidence intervals and tests facilitates a shift of formulations 
and techniques. In this connection, confidence curve estimates can be inter- 
preted as representing critical levels and power of tests when particular hy- 
potheses are of interest, or as incorporating testing techniques within estima- 
tion techniques when the latter are more appropriate. 

Confidence curve estimates can generally be obtained by simple adaptations 
of standard estimation and testing techniques. 





¹ Research supported by the Office of Naval Research.

Questions of justification of confidence curve estimation from the standpoint
of the foundations of statistical inference lie outside the scope of the
present note, but are discussed elsewhere [2], [3]. Problems of efficiency of
confidence curve estimates are also discussed elsewhere [1].


2. CONFIDENCE CURVES: DEFINITION AND EXAMPLE 


In problems of estimation of a real-valued parameter, a confidence curve 
estimate is formally defined simply as a set of upper and lower confidence lim- 
its, at each confidence coefficient from .5 to 1, inclusive; or alternatively as a 
set of “equal tail-area” confidence intervals, at each confidence coefficient from 
0 to 1, inclusive. It is natural to require that for each possible sample, the con- 
fidence intervals are nested, those with larger coefficients including those with 
smaller coefficients; this restriction is easily met in practice. 

In any specific application, the form and degree of completeness in which 
such estimates are reported can, of course, vary greatly, according to the judg- 
ment and convenience of those using the estimates. For many purposes, a 
rough indication, based on several confidence intervals or limits and perhaps a 
point estimate, may suffice; and even when a single confidence interval or 
limit is sufficient for some purpose of informative inference, it may be helpful 
to regard it as a very incomplete indication of a complete confidence curve, so 
as to avoid overly schematic interpretations. 

For some applications a graphical representation will be convenient, and the
term “confidence curve” may be used also to refer to the following specific
graphical form of an estimate: For each $c$, $0 < c \leq .5$, let $\theta_L(t, c)$ and $\theta_U(t, c)$,
respectively, denote lower and upper confidence limits for an unknown parameter
$\theta$, at the $1-c$ level, based on the observed value of some suitable statistic
$t$. Such a pair of estimates also represents a $1-2c$ level confidence interval.
In the $(\theta, c)$ plane, for each $c < .5$ plot the two points $(\theta_L(t, c), c)$ and
$(\theta_U(t, c), c)$. For $c = .5$, we have $\theta_L(t, c) = \theta_U(t, c)$, which is a median-unbiased
point estimate of $\theta$, represented by the point $(\theta_L(t, .5), .5)$. (A point estimator
of $\theta$ is called median-unbiased if its probabilities of over- and under-estimation
are equal, for all $\theta$.) We denote this graph, or the function of $\theta$ which it represents,
by $c(\theta, t)$. In most problems of interest, such a graph is continuous and
resembles that in Figure 1.

In problems involving many familiar simple distributions, the best choice
of the statistic $t$ is the usual sufficient statistic which leads to uniformly best
one-sided tests and to the corresponding standard best confidence limit estimators.
(In more complicated problems, efficiency considerations lead to methods
of confidence curve estimation not necessarily based on use of a single statistic
$t$; the theory of such methods, and numerical examples, have been reported
elsewhere [1].) Once a basic statistic $t$ has been chosen, then for each
possible value of $t$ and of $c$, the corresponding value of $\theta_U(t, c)$ is determined by
the usual condition for defining a confidence limit estimator, namely: $\theta_U(t, c)$ is
that value of $\theta$ for which the $100c\%$ point of the statistic's distribution equals
$t$. Similarly, $\theta_L(t, c)$ is the value of $\theta$ such that the $100(1-c)\%$ point equals $t$.

When the statistic $t$ has discrete distributions, one may choose one of the
usual alternatives: (a) when the discontinuities are small, and a convenient
close continuous approximation formula is available, as in the binomial example
below, the discontinuities may be ignored and the approximations may
be used for typical purposes; (b) confidence limits having bounded, but not
constant, confidence coefficients may be used as the elements of a confidence
curve; or (c) in principle, randomization could be used to give effectively continuous
distributions.

Example. Consider the problem of estimation of a binomial mean (proportion)
$p$, based on $n = 75$ observations and an observed proportion $\hat{p} = 45/75 = .6$.
The statistic $t = \hat{p} = x/n$ has a distribution which is approximately normal,
provided that $np$ and $n(1-p)$ are not very small, namely

$$\operatorname{Prob}(X/n \leq v \mid p) = \Phi\bigl(n^{1/2}(v - p)/(p(1 - p))^{1/2}\bigr), \tag{1}$$

where $\Phi(u) = (2\pi)^{-1/2} \int_{-\infty}^{u} \exp(-\tfrac{1}{2}w^{2})\,dw$.

FIG. 1. Confidence curve estimate of a binomial proportion $p$ based on $n = 75$
observations and an observed proportion $\hat{p} = x/n = 45/75 = .6$.

For such binomial problems generally, it is easily verified that the approximate
formula for a confidence curve estimate is

$$c(p, \hat{p}) = \Phi\bigl(-n^{1/2}\,|\hat{p} - p| / (p(1 - p))^{1/2}\bigr), \tag{2}$$

the approximation being close except where $np$, $n(1-p)$, $n\hat{p}$, or $n(1-\hat{p})$ are
very small. In the present example, we have

$$c(p, .6) = \Phi\bigl(-(8.66)\,|p - .6| / (p(1 - p))^{1/2}\bigr). \tag{3}$$

We observe that $c(.6, .6) = .5$, and we may evaluate $c(p, .6)$ for as many additional
values of $p$ as we wish to complete a sketch of the estimate $c(p, .6)$. Such
computations can, when desired, be based on the rougher approximation formula
obtained by replacing $p(1-p)$ by $\hat{p}(1-\hat{p})$ in the above formula; or when
convenient they may be based directly on available tables of the binomial distribution,
or on available tables or graphs giving binomial confidence limits at
a number of levels. A graph of the confidence curve estimate $c(p, .6)$ is given
in Figure 1.
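The approximate formula for the binomial confidence curve is easily evaluated at any number of points. The sketch below is one way to do so, using the error function for the normal distribution function; the function names are illustrative. The values it prints come from the approximation formula alone, so they need not agree to the last digit with figures read from a graph or taken from binomial tables.

```python
import math

def normal_cdf(u):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def confidence_curve(p, p_hat, n):
    """Approximate confidence curve c(p, p_hat) for a binomial proportion."""
    return normal_cdf(-math.sqrt(n) * abs(p_hat - p) / math.sqrt(p * (1.0 - p)))

# Evaluate the curve for the example in the text: n = 75, observed proportion .6.
for p in (0.45, 0.50, 0.55, 0.57, 0.60, 0.65, 0.70, 0.74):
    print(p, round(confidence_curve(p, 0.6, 75), 3))
```

Plotting these pairs, with $p$ on the horizontal axis and the curve value on the vertical axis, gives a sketch of the kind described for Figure 1, with its maximum of .5 at the observed proportion.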

A confidence curve as a whole is an estimate; its over-all meaning and interpretation
can be illustrated in part by interpretations of its various parts,
which coincide with customary types of estimates and tests and their customary
interpretations when used for informative inference. For example, the
binomial confidence curve above admits the following partial interpretations,
not all of which would be of particular interest in any single context of application.
Depending upon the completeness and accuracy of plotting, such interpretations
may be read more or less accurately by inspection of a graph such
as Figure 1:

1. A point estimate of $p$ is .6. (Here and in many common examples the best
median-unbiased estimates obtained in this way coincide exactly or very
nearly with the usual best unbiased (mean-unbiased) estimates, and with maximum
likelihood estimates, except for very small sample sizes in some problems.)

2. A lower 95.5% confidence limit for $p$ is .5.

3. A 98.4% (equal tail-area) confidence interval for $p$ is (.45, .74).

4. For the hypothesis $p \leq .5$ (or $p = .5$), against the alternative hypothesis
$p > .5$, the critical level is .045 (“just significant at the 5% level”). The power
of the test which rejects when the critical level is at least as small as that observed
is: (a) against the alternative $p = .74$: $.992 = 1 - c(.74, .6)$; (b) against
the alternative $p = .57$: only $.295 = c(.57, .6)$. (When the critical level of a one-sided
test is less than .5, as in cases of principal interest, then the power of the
corresponding test is given as just illustrated, by $c(\theta, t)$ for alternatives $\theta$ on
the same side of the maximum of the curve as the null hypothesis value(s) of
$\theta$; and by $1 - c(\theta, t)$ for alternatives on the opposite side of the maximum.)

5. For the hypothesis $p \geq .5$, the critical level is .955, which is far from suggesting
rejection.

6. For the hypothesis $p = .5$, against the two-sided alternative, the critical
level (of the equal tail-area test) is .09 (twice that found in the one-sided test
in 4. above).
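Readings of this kind can also be obtained numerically by inverting the curve. The sketch below assumes the normal approximation formula for the binomial confidence curve and finds, by bisection, the two values of $p$ at which the curve equals a chosen $c$; these serve as lower and upper limits at the $1-c$ level and jointly as a $1-2c$ interval. The function names and the choice $c = .025$ are illustrative.

```python
import math

def normal_cdf(u):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def confidence_curve(p, p_hat, n):
    """Approximate confidence curve c(p, p_hat) for a binomial proportion."""
    return normal_cdf(-math.sqrt(n) * abs(p_hat - p) / math.sqrt(p * (1.0 - p)))

def crossing(p_hat, n, c, lo, hi, steps=200):
    """Point in [lo, hi] where the curve equals c, assuming one sign change."""
    below_at_lo = confidence_curve(lo, p_hat, n) < c
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if (confidence_curve(mid, p_hat, n) < c) == below_at_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def confidence_limits(p_hat, n, c):
    """Lower and upper limits at the 1 - c level (a 1 - 2c interval)."""
    lower = crossing(p_hat, n, c, 1e-9, p_hat)        # rising branch of the curve
    upper = crossing(p_hat, n, c, p_hat, 1.0 - 1e-9)  # falling branch of the curve
    return lower, upper

print(confidence_limits(0.6, 75, 0.025))   # approximate 95% interval
```

Repeating the call for several values of $c$ yields the nested family of intervals that makes up the curve.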

For simultaneous estimation of several parameters, analogous methods can 
be based upon use of nested families of confidence regions. Their graphical 
representation is difficult, except in the case of two parameters where the 
boundaries of a number of confidence regions, at selected levels, can be sketched 
and labeled by their confidence coefficients.


CHANGES IN THE SIZE DISTRIBUTION OF DIVIDEND INCOME*


EDWIN B. COX
Boston University


Changes in the distribution of individuals’ dividend income by in- 
come group afford a measure of changes in the distribution of indi- 
vidually owned stock by income group. Between 1917 and 1957 the 
share of individuals’ dividend income going to persons in the top per- 
centile of the income distribution has halved, while the share going to 
persons in the upper five percentiles has fallen by one-third. The decline 
has been irregular, about half of it occurring in the periods 1927-1931 
and 1937-1943. This evidence of change in the distribution of individu- 
ally owned stock by income group contrasts with evidence which sug- 
gests an absence of change in the distribution by wealth group. 


1. INTRODUCTION


IN THE last decade the number of adults owning stock in the United States
has more than doubled, from less than six to more than twelve million persons.
This increased participation in the market has been accompanied by a
prolonged rise in stock prices. Further, there has been a great deal of publicity 
given to the idea that more people should “own a share in American business,” 
through purchase of stock. These inter-related developments have stimulated 
new research into a challenging area, the characteristics of the distribution of 
stock ownership. Who are the people who “own a share in American business” 
and how are these shares distributed among these owners? Briefly, who owns 
how much stock? 

This question is currently being studied from several points of view. Com- 
panies are studying their own stockholders, brokers are studying their cus- 
tomers and a number of individuals and agencies are studying all stockholders 
as a group. In contrast with the amount of effort devoted to estimating the 
present distribution of stock ownership, slight attention has been given to the 
development of a consistent picture of the historical changes in the distribu- 
tion. 

In this paper we have tried to develop the best possible estimates of the 
changes in the distribution of dividends among income groups in the popula- 
tion from 1917 to 1957. The basic data are those on dividend income of persons 
from the Individual Income Tax returns as tabulated in the Statistics of In- 
come [10]. The method is a modified form of that used by Professor Kuznets 
in Shares of Upper Income Groups in Income and Savings [5]. This method 
permits us to see changes in the share of total dividend income received by 
those individuals in the top one and the top five per cent of all persons grouped 
by size of income. These are Kuznets’ “upper income groups.” 

Stating the results briefly, in 1957 those in the top one per cent of persons
grouped by income received a share of all dividends about half as large as in
1917 and about two-thirds as large as in 1930. The 1957 share of those in the
top five per cent was two-thirds as large as in 1917 and four-fifths as large as
in 1930. There has been a pronounced decline in these shares, with about half
of the decline coming in the periods 1927-1931 and 1937-1943.

* The research on which this paper is based was performed while the author was a graduate student at the
University of Pennsylvania. The author wishes to acknowledge the guidance given by Professor Irwin Friend and

Between 1928 and 1956 there has also been a decline in the share of divi- 
dends received by the persons in the highest dividend income groups. The 
share of dividend income received by persons in the top one per cent of divi- 
dend recipients fell more than forty per cent, while the share of the top five per 
cent fell more than twenty-five per cent. This change is certainly not incon- 
sistent with the change noted above. (On the other hand, it is not necessarily 
implied by it either.) 

For the purposes of this discussion we assume that estimates of the
distribution of dividends by income group, or by amount of dividends
received, are an acceptable substitute for the distribution of stock by
income group or by amount of stock owned. This assumption is most
reasonable when only the upper income groups are considered, for these groups
have always had to file returns and their returns are most likely to be accurate.
Fortunately, the skewness of the distributions is so great that changes are
shown very well by studying only the top percentiles of the distributions.

We are forced to assume that the distributions among income groups of divi- 
dend paying and non-dividend paying stock are similar and that the same 
capitalization rate can be applied to dividend income at all income levels in 
estimating the value of stock owned from the amount of dividend income re- 
ceived. It can be argued that increasing progressivity in the income tax rate 
over time has brought a move among upper income investors from high yield 
stock to lower yield stocks, stocks paying no dividends, or to bonds. The evi- 
dence on both sides of this argument is highly conjectural [2]. We do not be- 
lieve that the entire change shown in the data to be examined here can be ex- 
plained away by this phenomenon. The assumption of a uniform capitalization 
rate appears reasonable in light of available data. Atkinson’s study of Wiscon- 
sin Income Tax returns showed that the rates of return on stocks held by per- 
sons in different income groups varied little from group to group [1]. His data 
show no tendency for upper income groups to hold lower yielding issues. 


2. BACKGROUND 


Evidence on the characteristics of individuals who own stock is available 
from three widely publicized surveys completed since 1952 by or for the New 
York Stock Exchange [4, 12, 8]. However, these surveys do not provide the 
information needed to estimate the distribution of individually owned stock 
by income groups, since the amount of stock owned by individuals is not re- 
ported. 

The amount of stock owned has been occasionally reported for the spending 
units surveyed in the periodic Surveys of Consumer Finances. The Survey, 
which annually reports on about 300 stock-owning spending units, does not 
yield sufficient information on the stock-owning spending units in high income 
groups, where large size holdings are found, to make satisfactory estimates of 
the distribution of individually owned stock by income groups. By pooling 
data from two surveys which included the same questions on income and 
amount of stock owned, using size classes covering larger incomes and amounts
of stock, the Survey authors have recently constructed an estimate based on 
interviews with more than 680 stock-owning spending units [9]. This appears 
to be the best estimate based on survey data which has been developed. An 
estimate of the distribution is available for 1948, based on the Survey findings 
for spending units with low and middle incomes and Statistics of Income data 
for those with high incomes [2]. Surveys have not produced the data necessary 
for a series of consistent estimates of the distribution extending over several 
years. 

In Kuznets’ Shares of Upper Income Groups in Income and Savings we find 
the only series of estimates which affords a basis for comparison of consistent 
figures over a long period of time. A study of long term changes in the distribu- 
tion of individually owned stock by income group must begin here. Prior to 
1940, Kuznets’ estimates cover only the dividends received by the top five 
per cent of persons, ranked by per capita total income. For a few scattered 
years for which the data permit it, the shares of the top ten per cent of persons 
are given. Beginning with 1940, when the coverage of the tax was broadened 
considerably, it is possible to estimate the share of the top twenty per cent. 
Kuznets’ estimates extend from 1917 to 1948. Using essentially the technique 
developed by Kuznets, let us bring his results up to date and then examine the 
total picture from 1917 to 1957. 


3. METHODOLOGY 


Kuznets’ method involved translating the number of returns in an income
class into the number of persons dependent upon the income reported on those 
returns. Within each income class the income was then divided by the number 
of persons and the groups ranked by per capita income. The result was an ap- 
proximation to the upper percentiles of the distribution of the total population 
and their incomes, including a breakdown of income by source, by per capita 
income. Logarithmic interpolation in the income columns led to estimates of 
the shares of persons with the largest incomes in total income and income from 
various sources. Dividend income was just one of the types analyzed in this 
fashion. 

We have modified this method in two respects. Professor Kuznets’ co- 
worker, Miss Elizabeth Jenks, has said that these modifications would have 
been a part of the method originally used if the necessary data had been avail- 
able [7]. These modifications are discussed in the Appendix. It will be suffi- 
cient here to indicate very briefly what is involved. The nature of the data 
forced Kuznets to treat taxable fiduciary returns as though they were returns 
from individuals filed in their own behalf. Because of this, dividends received
by a taxable fiduciary are treated as dividends received by an individual with 
the income of the fiduciary. In addition, the data forced Kuznets to use an 
unsatisfactory method for estimating dividends received by individuals in 
each income group from fiduciaries. With data now available, we have for- 
tunately been able to cope with these difficulties more satisfactorily. 

The collective effect of our adjustments in the original method is to reduce 
the share of dividends received by the upper one per cent of persons by about 
eight percentage points. The share received by the upper five per cent is reduced
by about ten percentage points. However, comparison of the shares
found by the original method and the shares found by the modified method 
for the years when both were used shows that the pattern of change through 
time shown is the same. This is most significant since our purpose is an exam- 
ination of the change in these shares through time. 

In order to make the historical comparison as clear as possible, an index has
been constructed showing changes in the share of the upper one and five per 
cent of individuals in dividends for most years since 1917, using 1957 as a base. 
The index is based on Kuznets’ estimates for 1917 through 1948 and the auth- 
or’s from 1939 to 1957. A reconciliation between the two sets of estimates was 
needed because of the difference in methods. This was accomplished by dividing 
Kuznets’ figures by the arithmetic mean of the ratios of his figures to the 
author’s, calculated for each year where two figures were available. After this 
adjustment small differences remained between our estimates and Kuznets’. 
These differences were removed by using the arithmetic mean of the two figures. 
Within Kuznets’ series of estimates there are also changes in method of computation
which necessitate a similar linking arrangement. These differences are
caused by the differences in the estimates of total dividend payments used by 
Kuznets to estimate the shares of upper income groups. Between 1917 and 
1919 those of Wilford King were used. For the period between 1919 and 1938 
Kuznets used his own and for the years starting in 1929 those of the Office of 
Business Economics in the Department of Commerce. This explains the fact 
that there are two sets of estimated shares reported by Kuznets for the period 
1929-1938. Our estimates of the shares from 1939 to 1957 are based on the De- 
partment of Commerce totals also. 
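
The linking step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `link_series` and `to_index` are hypothetical, every share value below except the 1957 figure of 38.3 per cent is invented for the example, and the handling of years covered by only one series (the adjusted or the author's figure is used alone) is one reading of the text, not a documented step.

```python
from statistics import mean

def link_series(kuznets, author):
    """Link two share series (dicts of year -> share in per cent).

    Kuznets' figures are divided by the arithmetic mean of the ratios
    kuznets/author over the overlap years; in overlap years the linked
    value is the mean of the adjusted and the author's figure.
    """
    overlap = sorted(set(kuznets) & set(author))
    if not overlap:
        raise ValueError("the two series must share at least one year")
    mean_ratio = mean(kuznets[y] / author[y] for y in overlap)
    adjusted = {y: v / mean_ratio for y, v in kuznets.items()}
    linked = {}
    for y in sorted(set(adjusted) | set(author)):
        if y in adjusted and y in author:
            linked[y] = (adjusted[y] + author[y]) / 2  # remove residual gap
        elif y in adjusted:
            linked[y] = adjusted[y]
        else:
            linked[y] = author[y]
    return linked, mean_ratio

def to_index(linked, base_year):
    """Express each share as a percentage of the base-year share."""
    base = linked[base_year]
    return {y: 100.0 * v / base for y, v in linked.items()}

# Hypothetical shares, apart from the 1957 value quoted in the text.
kuznets = {1938: 66.0, 1939: 60.0, 1940: 48.0}
author = {1939: 50.0, 1940: 40.0, 1957: 38.3}

linked, mean_ratio = link_series(kuznets, author)
index = to_index(linked, base_year=1957)
```

With these invented inputs both overlap ratios equal 1.2, so the adjusted Kuznets figures coincide with the author's in 1939 and 1940; with real data a small gap would remain and be averaged away.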

The index shows the shares of the upper one and five per cent of individuals 
in total dividends for each year, in terms of the share in 1957. The actual fig- 
ures for 1957 are 38.3 per cent received by the top one per cent and 54.0 per 
cent received by the top five per cent. The adjustments described in the pre- 
ceding paragraph were designed to make the figures for all other years com- 
parable with these figures for 1957. 

We have also attempted a comparison of the shares of dividends received by 
individuals receiving or benefiting from the largest amounts of dividends. Has 
there been a change in the share of dividends received by the top one or five 
per cent of dividend receivers? The data are extremely limited and adjustment 
to exclude taxable fiduciary returns impossible. However, a comparison be- 
tween 1928 and 1956 can be made to give a minimum measure of the decline 
in the shares. This was done by imputing to the appropriate income group of 
individual dividend recipients in 1956 all dividend income reported by taxable 
fiduciaries in that income group. To determine the number of individuals com- 
prising the top one or five per cent of persons benefiting from dividends we 
had to make an estimate of the total number of such persons in the population. 
The estimate was based on the findings of the 1956 Census of Shareowners of 
the New York Stock Exchange and the Surveys of Consumer Finances for 1955 
and 1957. 

4. RESULTS 


Table 1 presents the estimated shares of dividends received by upper income 
groups, as calculated by Kuznets for 1917-1948 and the author for 1939-1957. 
The index described above is shown in Table 2. In both tables the entries for
1944 and 1945 are missing because dividends were reported in combination 
with interest on tax returns for those years. The values for 1950 and 1953 can- 
not be found because no data from fiduciary returns have been published for 
those years. This made it impossible to determine the composition of fiduciar- 
ies’ incomes, as required by our modified method of estimation (see Appendix). 


TABLE 1. SHARES OF UPPER INCOME GROUPS IN DIVIDEND INCOME
1917-1957

Source: 1917-1948, Simon Kuznets, Shares of Upper Income Groups in Income and Savings (New York:
National Bureau of Economic Research, Inc., 1953), Table 123 and Addendum Table 3;
1939-1957, author's calculations.

In addition to the annual values, centered five-year averages are shown in 
Table 2. These suggest the pattern of decline in the shares better than the 
annual figures with their short run fluctuations. 
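
As a small illustration of a centered five-year average, the sketch below averages each year with the two years on either side. The function name `centered_average` is hypothetical, the input is the first five index values listed in Table 2 (1917-1921), and skipping any year that lacks a full five-year window is an assumption; the paper does not say how incomplete windows (for example around 1944 and 1945) were handled.

```python
def centered_average(series, window=5):
    """Centered moving average of a dict year -> value.

    A year is included only if every year in its window is present
    (an assumption; the source does not describe this case).
    """
    half = window // 2
    averages = {}
    for year in sorted(series):
        years = range(year - half, year + half + 1)
        if all(y in series for y in years):
            averages[year] = sum(series[y] for y in years) / window
    return averages

# First five index values listed in Table 2 (1957 = 100).
index_values = {1917: 197, 1918: 168, 1919: 181, 1920: 177, 1921: 160}
averages = centered_average(index_values)
```

Only 1919 has a complete window in this short extract, so it is the only year for which an average is returned.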

The one per cent of persons receiving the largest dividends in 1928 received
approximately fifty-six per cent of dividends paid. This share declined to not
more than thirty-two per cent by 1956. For the top five per cent of persons the
corresponding percentages are seventy-four and fifty-four. These figures
include fiduciaries as persons. As we pointed out earlier, this is undesirable but
unavoidable. If fiduciaries could be excluded the figures would be smaller but
the change would probably be about the same. The 1956 figures, excluding
fiduciaries, are twenty-seven and forty-seven per cent.


TABLE 2. INDEX OF SHARES OF UPPER INCOME GROUPS
IN DIVIDEND INCOME
1917-1957
(1957 = 100)

            Share Index         Centered Five-Year Averages
Year      Top 1%   Top 5%          Top 1%   Top 5%

1917        197
1918        168
1919        181
1920        177
1921        160
1922        176
1923        158
1924        168
1925        166
1926        180
1927        179
1928        175
1929        159
1930        149
1931        143
1932        145
1933        141
1934        144
1935        133
1936        147
1937        153
1938        136
1939        138
1940        142
1941        127

It should be noted again that “persons receiving dividends” refers to persons
filing tax returns reporting receipt of dividends and their dependents. “Divi- 
dends” refers to dividend income of the person filing the return and to the por- 
tion of fiduciary income paid to him which originated in dividends received by 
the fiduciary. Thus, the shares shown in Table 1, and on which Table 2 is 
based, are ratios in which the numerator is dividend income as just defined, 
and the denominator is the total of all dividends paid by corporations in the 
United States, excluding payments to other corporations. To be strictly com- 
parable with the numerator this total in the denominator should be adjusted 
to exclude dividends which are not received by individuals, e.g., dividends re- 
ceived by mutual insurance companies, non-profit organizations, non-insured 
pension funds and fiduciaries (if not subsequently distributed to individuals) 
or which need not be reported as income on tax returns, e.g., dividends repre- 
senting distribution of capital gains. According to Holland’s estimates [11], 
during the period 1936 to 1957 the magnitude of this adjustment has varied 
between five and nineteen per cent of the total reported by the Department 
of Commerce. If the shares as shown in Tables 1 and 2 appeared to decline 
only during periods when the magnitude of the adjustment increased, and to 
increase when the adjustment decreased, it would suggest that the decline is 
only a statistical illusion. This has not been the case, even for short intervals 
within the twenty-one-year period. For the entire period, the adjustment in- 
creased about three percentage points while the share of the highest percentile 
in dividend income fell over fifteen percentage points. 


5. CONCLUSIONS 


The share of dividends received by persons in the upper income groups has 
shown a long run tendency to drop since 1917. Nearly half of the total drop 
came in rapid declines during the periods 1927-1931 and 1937-1943. Since 
1946 a brief two-year rise in these shares has been followed by a nine-year 
gradual decline through 1957, the most recent year for which data are available.
Between 1928 and 1956 there has also been a decrease in the share of dividends
received by persons in the upper groups of dividend recipients.

We have made the assertion that these changes in the distribution of divi- 
dends reflect changes in the distribution of stock ownership as well. Therefore, 
we conclude that tax return data indicate there has been a reduction in the 
concentration in the distribution of stock owned by individuals, directly or 
through fiduciaries, in the various income groups. This result apparently can- 
not be tested directly with any body of independent data presently available. 
It can, however, be compared with the findings of Robert Lampman on the 
distribution of stock ownership among individuals in the various wealth groups 
[6]. Lampman’s results show that the wealthiest one per cent of adults held 
seventy-six per cent of the corporate stock held by individuals in 1953. In 
earlier years this share changed as follows: 1922, 61.5 per cent; 1929, 65.6 per 
cent; 1939, 69.0 per cent; 1945, 61.7 per cent and 1949, 64.9 per cent. Thus, it 
seems that the distribution of stock by wealth groups has not become less
concentrated since 1922. Admitting that the 1953 figure seems unreasonably
high, Lampman did not venture the assertion that the concentration had
actually increased.
Lampman’s results are derived from the Federal Estate Tax returns by the
application of the “estate multiplier” technique. In this technique, the de- 
cedents in each wealth class are assumed to be typical of the living with respect 
to composition of assets. Multiplying the value of stock in the estates of the 
decedents by the reciprocal of the death rate prevailing in that wealth group 
yields an estimate of the value of the stock held by the living in that wealth 
group. Since returns must be filed only for large estates, the data cover only 
upper wealth groups. The values of the stock estimated in this way are ex- 
pressed as percentages of the estimated value of all stock owned by individuals, 
regardless of wealth. 
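
The arithmetic of the estate multiplier technique, as described above, is short enough to sketch. The function names and all figures below are hypothetical and serve only to show the calculation: the value of stock in decedents' estates is multiplied by the reciprocal of the death rate for the wealth group, and the resulting estimates are expressed as a percentage of an assumed total.

```python
def estate_multiplier_estimate(decedent_value, death_rate):
    """Estimated stock held by the living in one wealth group.

    decedent_value: value of stock in the estates of decedents.
    death_rate: death rate prevailing in that wealth group (0 < rate <= 1).
    """
    if not 0.0 < death_rate <= 1.0:
        raise ValueError("death_rate must lie in (0, 1]")
    return decedent_value * (1.0 / death_rate)

def percentage_share(group_estimates, total_value):
    """Share of the estimated holdings in the total, in per cent."""
    return 100.0 * sum(group_estimates) / total_value

# Hypothetical wealth groups: (stock in estates, death rate).
groups = [(2.0, 0.02), (1.5, 0.03)]
estimates = [estate_multiplier_estimate(v, m) for v, m in groups]
share = percentage_share(estimates, total_value=250.0)
```

Here the two groups are estimated to hold 100 and 50 units, or 60 per cent of the assumed total of 250.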

The lack of a decline in the share of stock held by the wealthiest one per 
cent of adults stands in contrast with our conclusion that the percentage share 
of stock owned by the one per cent of persons, including dependents, in the 
highest income group has declined between 1922 and 1953. The difference may 
be explained in one of three ways. Either Lampman’s figures or ours may fail 
to show what they purport to show. Or, quite possibly, we are both right. It 
must be remembered that our estimates are of shares of stock held by top in- 
come groups while Lampman’s are for top wealth groups. 

If, between 1922 and 1953, certain changes occurred in the characteristics 
of the individuals in the upper income and upper wealth groups, the apparent 
contradiction might be resolved. Possibly, the likelihood that a person in the 
upper wealth bracket would also be in the upper income bracket declined dur- 
ing the period. If so, stock ownership would continue to be concentrated in the 
top wealth group while becoming less concentrated in the top income group. 
During the period covered by Lampman’s findings, the increase in life expect- 
ancy gradually raised the mean age at death of the population generally. This 
means relatively more people lived into the period in life when income tends 
to decline while wealth remains relatively stable. In the later years of life 
stock plays a larger role in an individual’s financial asset holdings than in 
earlier years. There is a tendency to reduce investments in housing, insurance 
and proprietorships or partnerships and shift funds to stocks and bonds for
income. This behavior may mean that at the very time persons in the upper
wealth groups move from the upper to a lower income group they are
increasing their holdings of stock. It is possible that, as a result of these phenomena,
the concentration of stock in the top wealth groups appeared to remain stable 
while the concentration in the top income groups fell. 

Lampman discusses the difference between his findings and Kuznets’. Kuz- 
nets found a larger decrease in the inequality in the size distribution of income 
than Lampman found in the size distribution of wealth. Lampman concludes, 
“Wealth distribution appears to have changed less than income distribution 
during this period” [6, p. 32]. Although this quotation refers to size distribu- 
tions of total wealth and total income, there seems good reason to believe it 
may also apply to one type of wealth in particular, i.e., stock. The distribution 
of stock among wealth groups may have changed less than the distribution of
stock among income groups during this period. 

The evidence available from Individual Income Tax returns further suggests 
a reduction since 1928 in the concentration among large dividend receivers of 
dividends received by individuals directly and as beneficiaries of trusts or 
estates. Does this imply that there has been a like reduction in the concentration
in the size distribution of the value of stock owned by individuals and by
them, indirectly, through fiduciaries? The best available evidence relating to 
this question is found in corporation stockholder of record data. Corporate 
records give distributions of shares in record shareholdings, which include 
brokers’, banks’ and institutional holdings as well as fiduciaries’ and indi- 
viduals’ holdings. Allowing for the complications involved in using them, 
such data do suggest a reduction in the concentration in the size distributions 
of holdings since 1937. The author has made a study of these data and intends 
to report on the findings at a later date. No satisfactory results could be ar- 
rived at for the years before 1937. 


APPENDIX 


Until 1937 it was impossible to study data from tax returns filed by indi- 
viduals on their own behalf separated from returns filed by taxable fiduciaries, 
i.e., persons acting as trustees or executors. Since he could not avoid it, Kuznets 
treated estates and trusts in all years as persons and their income as income 
received by persons. In so doing he attributed much dividend income to upper 
income groups. Furthermore, he double counted dividends received by taxable 
fiduciaries and subsequently received by individuals through these fiduciaries. 
Starting with 1939, the data for individuals only have been used in the modified 
method to estimate the shares of upper income groups in dividends. In this 
way the taxable fiduciaries are eliminated and the dividends received by them 
are no longer double counted. 

Before 1936 and after 1953, dividends received by individuals in their ca- 
pacity as beneficiaries of a trust or estate were reported as dividends, not as 
fiduciary income. Dividends, in these years, have been taxed in a special way 
which did not apply to other types of income. To take advantage of the favorable
treatment accorded dividends, individuals were permitted to show as
dividends that part of fiduciary income which could be shown to arise out of 
dividends received by the fiduciary. Between 1936 and 1953 dividends re- 
ceived from fiduciaries were included in fiduciary income since dividends were 
taxed like all other types of income. Consequently, an adjustment has to be 
made to add such dividends to regularly reported dividends during this period 
or remove them from regularly reported dividends in all other years. Kuznets 
chose the former. 

He estimated how much of fiduciary income was derived from dividends 
paid to the fiduciary by assuming all fiduciary income received by persons in 
an income class was derived from dividends or interest. He apportioned fiduci- 
ary income between dividends and interest according to the amounts of divi- 
dends and interest received directly by persons in the income class. On this 
basis nearly all fiduciary income received by upper income groups was allo- 
cated to dividends since dividends far overshadow interest in the income of 
persons in upper income groups. 

A better solution to this problem of allocation can be made with data which 
are available only for recent years on the amount of income received by fidu- 
ciaries from various sources. For the years between 1939 and 1953 for which 
it could be done, the ratio of dividends received by fiduciaries to total income
of fiduciaries was used to estimate the share of fiduciary income of persons in 
each income group which consisted of dividends. This was not possible in 1941, 
1942, 1943, 1950 and 1953 because the necessary data from fiduciary returns 
were not published. 
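
The difference between the two allocation rules in this Appendix can be made concrete with a short sketch. The function names and all amounts are hypothetical; the first function follows the description of Kuznets' apportionment (by the dividends and interest received directly in the income class), the second the modified method (by the composition of fiduciaries' own income).

```python
def kuznets_allocation(fiduciary_income, direct_dividends, direct_interest):
    """Dividend part of fiduciary income, apportioned by the dividends and
    interest that persons in the income class received directly."""
    return fiduciary_income * direct_dividends / (direct_dividends + direct_interest)

def modified_allocation(fiduciary_income, fiduciaries_dividends, fiduciaries_total_income):
    """Dividend part of fiduciary income, using the ratio of dividends
    received by fiduciaries to the total income of fiduciaries."""
    return fiduciary_income * fiduciaries_dividends / fiduciaries_total_income

# Hypothetical upper income class receiving 100 of fiduciary income.
by_kuznets = kuznets_allocation(100.0, direct_dividends=90.0, direct_interest=10.0)
by_modified = modified_allocation(100.0, fiduciaries_dividends=40.0,
                                  fiduciaries_total_income=100.0)
```

With these invented figures the first rule assigns 90 of the 100 to dividends and the second only 40, which illustrates why the original rule allocated nearly all fiduciary income of upper income groups to dividends.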
THE USE OF SAMPLE QUASI-RANGES IN SETTING
CONFIDENCE INTERVALS FOR THE POPULATION
STANDARD DEVIATION*


F. C. Leone, Y. H. Rutenberg, and C. W. Topp†
Case Institute of Technology 


The problem is the determination of an optimal selection method of 
quasi-range for setting one-sided confidence bounds and confidence
intervals for the standard deviation from a given distribution. The
proposed methods of optimal selection are applied to random ordered
samples from the normal, exponential and rectangular distributions. 
Tables of confidence bounds for the standard deviation of these dis- 
tributions are given for confidence levels commonly used in statistical 
work. These are compared with the results of standard procedures. 


1. INTRODUCTION 


IN A paper by Chu [2] methods were proposed for the use of sample
quasi-ranges in setting confidence intervals for interquantile distances: If $F(x)$
is a given cumulative distribution function, or cdf. (not necessarily
continuous), and $0<p<1$, then any $\xi_p$ such that

$$F(\xi_p - 0) \le p \le F(\xi_p)$$

is called a p-quantile of $F(x)$. Given an ordered random sample
$x_1 \le x_2 \le \cdots \le x_n$ of size n drawn from the distribution $F(x)$, confidence
intervals for the interquantile distance $\xi_q - \xi_p$ $(1>q>p>0)$ can be obtained of the
form $(x_v - x_u,\ x_s - x_r)$, where the integers r, s, u and v satisfy the inequalities
$1 \le r < s \le n$, $1 \le u < v \le n$, $u < s$. It was shown in Chu [2] that

$$\Pr\{x_s - x_r \ge \xi_q - \xi_p\} \ge B_n(s-1, q) - B_n(r-1, p) = L,$$
$$\Pr\{x_v - x_u \le \xi_q - \xi_p\} \ge -B_n(v-1, q) + B_n(u-1, p) = L', \tag{1}$$

where

$$B_n(k, p) = \sum_{i=0}^{k} \binom{n}{i} p^i (1-p)^{n-i}.$$

It was also shown that if n is sufficiently large, then for any $\alpha$ $(0<\alpha<1)$, there
exists at least one set of integers r, s, u and v for which L and $L' \ge 1-\alpha$.
Therefore, the corresponding $x_s - x_r$ and $x_v - x_u$ are respectively upper and lower
confidence bounds for $\xi_q - \xi_p$, each with confidence coefficient at least $1-\alpha$,
and $(x_v - x_u,\ x_s - x_r)$ is a confidence interval for $\xi_q - \xi_p$ with confidence
coefficient at least $1-2\alpha$.
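
The quantities in (1) are easy to evaluate numerically. The following is a minimal sketch, reading (1) as $L = B_n(s-1,q) - B_n(r-1,p)$ and $L' = -B_n(v-1,q) + B_n(u-1,p)$; the function names `binom_cdf` and `coverage_bounds` are hypothetical and are not taken from the paper.

```python
from math import comb

def binom_cdf(n, k, p):
    """B_n(k, p): probability that a binomial(n, p) variable is at most k."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(min(k, n) + 1))

def coverage_bounds(n, r, s, u, v, p, q):
    """Lower bounds L and L' on the coverage of the upper bound x_s - x_r
    and the lower bound x_v - x_u for the distance xi_q - xi_p."""
    L = binom_cdf(n, s - 1, q) - binom_cdf(n, r - 1, p)
    L_prime = -binom_cdf(n, v - 1, q) + binom_cdf(n, u - 1, p)
    return L, L_prime

# Example: the sample range of n = 10 observations (r = u = 1, s = v = 10)
# used for the distance between the .1 and .9 quantiles.
L, L_prime = coverage_bounds(10, r=1, s=10, u=1, v=10, p=0.1, q=0.9)
```

In this example the same pair of order statistics serves both purposes, so the two bounds are negatives of each other; in practice different (r, s) and (u, v) would be chosen so that each bound is at least $1-\alpha$.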
Further, it was shown in [2] that if $F(x)$ is of the form $F_0[(x-a)/b]$ then $\sigma$,
the standard deviation of $F(x)$, is given by

$$\sigma = (\xi_q - \xi_p)/c_0 \tag{2}$$

where $c_0$ is a constant which depends on $F_0(x)$, p and q, but not on a and b.
Thus, if $(x_v - x_u,\ x_s - x_r)$ is a confidence interval for $\xi_q - \xi_p$, then
$((x_v - x_u)/c_0(v, u),\ (x_s - x_r)/c_0(s, r))$ is a confidence interval for $\sigma$.

* This research was supported in part by the United States Air Force through the Air Force Office of
Scientific Research and Development.
† Presently at Fenn College, Cleveland, Ohio.
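
For the three families named in the abstract, the constant $c_0$ in (2) has a simple closed form, obtained by dividing the interquantile distance of the standard member of the family by its standard deviation. The sketch below is an illustration only; the function names are hypothetical, and the standard forms chosen (unit normal, rectangular on (0, 1), exponential with unit mean) are assumptions made for convenience, since $c_0$ does not depend on location or scale.

```python
from math import log, sqrt
from statistics import NormalDist

def c0_normal(p, q):
    """Unit normal: sigma = 1, so c0 = z_q - z_p."""
    z = NormalDist().inv_cdf
    return z(q) - z(p)

def c0_rectangular(p, q):
    """Rectangular on (0, 1): xi_p = p and sigma = 1/sqrt(12)."""
    return (q - p) * sqrt(12.0)

def c0_exponential(p, q):
    """Exponential with unit mean: xi_p = -ln(1 - p) and sigma = 1."""
    return log((1.0 - p) / (1.0 - q))

# Interquartile distance (p = .25, q = .75) in units of sigma.
values = (c0_normal(0.25, 0.75),
          c0_rectangular(0.25, 0.75),
          c0_exponential(0.25, 0.75))
```

Dividing an observed interquantile estimate by the appropriate $c_0$ converts it to the scale of $\sigma$, which is how the confidence interval for $\xi_q - \xi_p$ becomes one for $\sigma$.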

A problem now arises: when one is interested in obtaining confidence
intervals for $\sigma$, then p and q are not given in advance. It was, therefore, proposed
in a subsequent paper by Chu, Leone and Topp [3]:


(i) to base the procedure on symmetrical quasi-ranges, namely to set

$$s = n - r + 1, \qquad x_{n-r+1} - x_r = w_r,$$
$$v = n - u + 1, \qquad x_{n-u+1} - x_u = w_u, \tag{3}$$

as well as to set $q = 1 - p$. Then $L \ge 1-\alpha$ reduces to $B_n(r-1, p) \le \alpha/2$,
and $L' \ge 1-\alpha$ reduces to $B_n(u-1, p') \ge 1-\alpha/2$.

(ii) to choose p for each r $(1 \le r \le [n/2])$ such that

$$B_n(r-1, p) = \alpha/2, \tag{4}$$

and to choose $p'$ for each u $(1 \le u \le [n/2])$ such that

$$B_n(u-1, p') = 1 - \alpha/2, \tag{4a}$$

that is, choose p and $p'$ such that $w_r/c_0(r)$ and $w_u/c_0(u)$ are respectively
upper and lower confidence bounds for $\sigma$, each with confidence
coefficient at least $1-\alpha$.

(iii) to choose that value of r which minimizes $E[w_r/c_0(r)]$ and that value
of u which maximizes $E[w_u/c_0(u)]$ in order to obtain the shortest
confidence interval for $\sigma$ with confidence coefficient $1-2\alpha$.
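
Steps (i) and (ii) can be sketched numerically for the normal case. This is a rough illustration, not the authors' computing method: the function names are hypothetical, bisection is simply one convenient way of solving (4) and (4a), and requiring $p < 1/2$ (so that $q = 1-p$ exceeds p) is an assumption made explicit in the code. Step (iii), which needs the expected quasi-range $E[w_r]$, is not attempted here.

```python
from math import comb
from statistics import NormalDist

def binom_cdf(n, k, p):
    """B_n(k, p): probability that a binomial(n, p) variable is at most k."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def solve_p(n, k, target, tol=1e-12):
    """Solve B_n(k, p) = target for p by bisection (B_n decreases in p)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_cdf(n, k, mid) > target:
            lo = mid  # cdf still too large, so p must be larger
        else:
            hi = mid
    return (lo + hi) / 2

def normal_c0(p):
    """c0 = (xi_{1-p} - xi_p) / sigma = 2 |xi_p| for the unit normal."""
    return 2.0 * abs(NormalDist().inv_cdf(p))

def bound_factors(n, alpha):
    """For each r, the multipliers of the quasi-range w_r giving the lower
    and upper confidence bounds for sigma; None where p >= 1/2."""
    factors = {}
    for r in range(1, n // 2 + 1):
        p_upper = solve_p(n, r - 1, alpha / 2)      # equation (4)
        p_lower = solve_p(n, r - 1, 1 - alpha / 2)  # equation (4a)
        lower = 1 / normal_c0(p_lower) if p_lower < 0.5 else None
        upper = 1 / normal_c0(p_upper) if p_upper < 0.5 else None
        factors[r] = (lower, upper)
    return factors

factors = bound_factors(10, alpha=0.005)
```

For n = 10 and $\alpha = .005$ the lower-bound multiplier of $w_1$ computed this way is close to 0.1437.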


2. CONFIDENCE INTERVALS FOR $\sigma$ OF THE NORMAL DISTRIBUTION


In Table I we present confidence bounds (based on quasi-ranges) for $\sigma$ of the
normal distribution which were arrived at by the procedure described in
Section 1. These are given for confidence levels $\alpha$ = .005, .01, .025, .05 and .1, and
within each value of $\alpha$ for sample sizes n = 10(5)100. In order to illustrate our
method and to provide guidance for computation of values for sample sizes and
confidence levels not included in our table, the following step-by-step procedure
is given for the case of the normal distribution:

Given n, $\alpha$ and $F(x) = N(0, 1)$, namely the standardized normal cdf., then
for each $1 \le r \le [n/2]$:

(i) Find p such that $B_n(r-1, p) = \alpha/2$; denote this p by $p_0$.¹
(ii) Find $\xi_{p_0}$ such that

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\xi_{p_0}} e^{-t^2/2}\, dt = p_0.$$

(iii) Requiring $q = 1 - p$, compute

$$c_0(r) = \frac{\xi_{1-p_0} - \xi_{p_0}}{\sigma} = 2\,|\xi_{p_0}|.$$

¹ Details on this are given in Section 4.
TABLE I. CONFIDENCE BOUNDS FOR $\sigma$ FROM THE
NORMAL DISTRIBUTION

Confidence Coefficients: Bounds .995, Intervals .990 (left); Bounds .990, Intervals .980 (right)

Columns for each pair of confidence coefficients:
Lower Confidence Bound | Upper Confidence Bound | Expected Length of Interval | Effectiveness %





0.143652w: -046431w: |12.010824 13.44 -151970w1 | 2.233954ui -407316 
- 1940092 -13271lw, | 3.448357 33.7 - 146951: -941648u1 -759052 
-186716w:2 -773365u1 | 2.362835 | 40.28 - 1955922 -685279w1 -008850 
-181596w: -628539w: | 1.916938 | 42.97 - 189779: -571984m; -669694 
-211475us -548656u1 | 1.664301 44.22 -220170ws - 50675911 -469394 


- 2063983 -497291ui -497622 | 44.81 -214510ws -463736u1 -332763 
- 202282wa -564301ws .364838 49 -209939ws -527360w: -212082 
-223389w.s -520913ws . 253926 -28 -231306u.4 -490084.w2 . 120200 
-219317ws -488240ws - 169200 -76 -226833w4 -461665w2 -048613 
- 2360520 -462612ws . 102146 -04 -222997 ws -439162w2 -991452 


-232120ws -498121ws -045738 ° -239466ws -473500003 -940221 
- 228667 ws -475199ws - 993263 ‘ -235711ws -453126ws -895103 
- 24224 1ws -456127 wa - 949204 ° -232382ws -436070ws -857079 
- 238885 -439964ws -910946 . -245785we -421542ws -823988 
23586406 -426060ws .877861 ° -242524w6 -44488 lus -793985 


.247284w7 - 4489804 -846691 . -239572ws -431366w4 - 766481 
- 2443371 -435943u.4 -818522 ‘ -250875w7 -419516ws -741918 
-424437 uw. - 793552 f -247981w1 -409024w4 -719946 
-414192u,4 -771178 . -245322w1 -426646us -699750 
































TABLE I.—Cont’d.








Confidence Coefficients 





Bounds—.975 Bounds— .950 
Intervals— .950 Intervals— .900 





Lower Upper Expected Lower Upper Expected 
Confidence | Confidence | Length of Confidence | Confidence | Length of 
Bound Bound Interval Bound Bound Interval 





165457 -344318w:. | 3.627952 ° -178334u | 1.000592u: . 530503 
- 1591071 -753317w1 -062993 ° -170540w: | 0.642110w: -637211 
-209592ws -586099w .599008 ° - 222563 ws -520665w -818096 
-202582ws - 5044271 .365125 ‘ -214330u2 -457416u1 144525 
-233718ws -454971 11 . 220832 ; - 208221 we: -417759wi -033960 





-227076ws -521691ws - 106108 A -238490w: -480773w2 -934741 
-221746ws -480837 ws -013138 . -232413ws -446832u2 -861544 
- 243497 ws -450636w02 - 943282 . -227406w: -421353ws -805995 
- 2383660. -427245ws .887162 . - 24873 lw, -401388w2 - 760893 
- 233983 ws -462218w3 -840123 . -243822u, -455103w3 -719524 


-250698ws -441347ws -797734 . - 239573 ws -417003w; -685080 
- 24645505 -424078ws -761589 A -256046ws -401916ws -655617 
-242704ws -409506ws - 730910 ‘ -251894w5 -389108w3 -630204 
- 2562836 -433045u. -703771 . -248195ws -412000w, -606331 
- 252639ws -41956814 -678129 > - 24487 ws -400041 uw, -585535 


-249346ws -407820u. -655671 . - 25801 1lws - 38957 lws - 566500 
- 26078571 - 39746804 -635802 ° -254731ws -407433ws -549713 
- 2575771 -415161ws -617280 ° -266071w7 -397602ws - 533728 
- 2546357 -405396ws - 599956 . -262864w7 -388810ws -519167 
TABLE I.—Cont’d.








Confidence Coefficient 





Bounds— .900 


Sample Intervals—.800 


Harter’s Point 
size n Wet " 





Lower Confidence | Upper Confidence | Expected Length Effectiveness 
Bound Bound of Interval 





0.194683u: -773365u1 1.780897 . 0.324939w1 
- 1848071 -548656u1 1.263222 . - 288033 wi 
- 23856 lw -461097w1 1.050576 . - 3552 14we 
-228669w2 -412803u 0.925455 . -328019w3 
-221401w: -479627 ws -834380 . -309483u2 


- 252287 ws -441081w2 - 760620 . -345394ws 
- 2452340 -413266u2 - 704954 . -329593ws 
- 239457 ws - 392067 ws -662141 . -317141w; 
- 261127 -426612us -625148 . -341592u. 
- 2555454 -407749us -592421 F -330436u. 


25073 1ws -392245us -565349 ° -821055w. 
-267436ws -879228ws - 542499 ° -839581l ws 
-262780ws -402417m% -520876 . -330956ws 
-258643.ws -890352w. -601849 , -846274ws 
- 254933 ws -379860ws -485237 : - 338338w. 





-26823 lus .397860ws -470228 . -831306ws 
-264596we -888054ws 455938 ‘ -344065w7 
-261282we -379323ws .443174 , -337494w7 
-272537w1 -371488ws -431545 . -331559w7 




















since $\xi_{q_0} = -\xi_{p_0}$ and $\sigma_0 = 1$.
(iv) Compute $E[w_r/c_0(r)] = E(w_r)/c_0(r)$.
Having followed the steps (i) through (iv) for all r,
(v) Choose that r which minimizes $E(w_r)/c_0(r)$.
Similarly, for each $1 \le u \le [n/2]$:
(i) Find $p'$ such that $B_n(u-1, p') = 1 - \alpha/2$; denote this $p'$ by $p_0'$.
(ii) Find $\xi_{p_0'}$ such that

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\xi_{p_0'}} e^{-t^2/2}\,dt = p_0'.$$

(iii) Compute

$$c_0(u) = \frac{\xi_{q_0'} - \xi_{p_0'}}{\sigma_0} = 2\,|\xi_{p_0'}|.$$

(iv) Compute $E[w_u/c_0(u)] = E(w_u)/c_0(u)$.
(v) Choose that u which maximizes $E(w_u)/c_0(u)$.
Thus, if $x_1, x_2, \ldots, x_n$ is an ordered random sample drawn from the normal
distribution, then r and u chosen in the manner described above determine
$(w_u/c_0(u),\; w_r/c_0(r))$ as a confidence interval for $\sigma$ with confidence coefficient
at least $1 - 2\alpha$.
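The steps above can be sketched in Python using only the standard library. This is an illustrative sketch, not the authors' program: the function names are hypothetical, $B_n(k, p)$ is taken to be the cumulative binomial probability $P(X \le k)$, p is found by bisection rather than from published tables, $E(w_k)$ is obtained by a simple trapezoidal rule, and candidates for r with $p_0 \ge 1/2$ are skipped (an assumption made here so that $\xi_{p_0} < \xi_{q_0}$).

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1)

def binom_cdf(k, n, p):
    """B_n(k, p) = P(X <= k) for X ~ Binomial(n, p) (assumed definition)."""
    return sum(math.comb(n, j) * p**j * (1.0 - p)**(n - j) for j in range(k + 1))

def solve_p(k, n, target, iters=200):
    """Solve B_n(k, p) = target for p by bisection (B_n decreases in p)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if binom_cdf(k, n, mid) > target:
            lo = mid   # cdf still too large, so p must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)

def c0_normal(p0):
    """Step (iii): c0 = xi_{q0} - xi_{p0} = 2|xi_{p0}| for N(0, 1)."""
    return 2.0 * abs(STD_NORMAL.inv_cdf(p0))

def expected_quasi_range(n, k, half_width=9.0, steps=6000):
    """E(w_k) = E(x_(n-k+1)) - E(x_(k)) for N(0, 1), by the trapezoidal rule."""
    j = n - k + 1                    # rank of the upper order statistic
    coef = j * math.comb(n, j)       # n! / ((j-1)! (n-j)!)
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        F = STD_NORMAL.cdf(x)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x * F**(j - 1) * (1.0 - F)**(n - j) * STD_NORMAL.pdf(x)
    return 2.0 * coef * total * h    # symmetry: E(x_(k)) = -E(x_(n-k+1))

def best_bounds(n, alpha):
    """Steps (i)-(v): return (r, upper constant) and (u, lower constant)."""
    upper, lower = [], []
    for k in range(1, n // 2 + 1):
        ew = expected_quasi_range(n, k)
        p_r = solve_p(k - 1, n, alpha / 2.0)
        p_u = solve_p(k - 1, n, 1.0 - alpha / 2.0)
        if p_r < 0.5:                # assumption: keep only p0 below 1/2
            upper.append((ew / c0_normal(p_r), k, 1.0 / c0_normal(p_r)))
        if p_u < 0.5:
            lower.append((ew / c0_normal(p_u), k, 1.0 / c0_normal(p_u)))
    _, r, upper_const = min(upper)   # smallest expected upper bound
    _, u, lower_const = max(lower)   # largest expected lower bound
    return (r, upper_const), (u, lower_const)
```

For n = 30 and $\alpha$ = .025 the sketch selects r = 1 with an upper constant close to the .454971 quoted in Example 2 of the Appendix.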
The expected value $E(l_0)$ of the length of confidence intervals set by this
method is then²

$$E(l_0) = \sigma\{E(w_r)/c_0(r) - E(w_u)/c_0(u)\}. \tag{5}$$

2.1 COMPARISON WITH THE CONVENTIONAL METHOD OF SETTING CONFIDENCE
INTERVALS FOR σ
In the conventional method the statistic

$$s = \left[\sum_{i=1}^{n} (x_i - \bar{x})^2\right]^{1/2}$$

is used for obtaining both point estimates and confidence intervals for $\sigma$. Then,
confidence intervals for $\sigma$ are given by $(s/a_1,\; s/a_2)$, where

$$a_1^2 = \chi^2_{1-\alpha,\,n-1}, \qquad a_2^2 = \chi^2_{\alpha,\,n-1}.$$

The expected value $E(l)$ of the length of confidence intervals is
$E(s)(1/a_2 - 1/a_1)$; the expected value $E(s)$ of s is given by

$$E(s) = \sigma\sqrt{2}\,\Gamma(n/2)\Big/\Gamma\!\left(\frac{n-1}{2}\right)$$

and, therefore,

$$E(l) = \sigma\sqrt{2}\,\Gamma(n/2)\Big/\Gamma\!\left(\frac{n-1}{2}\right)(1/a_2 - 1/a_1). \tag{6}$$
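Equation (6) is easy to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, and the chi-square percentage points are passed in by the caller (here the standard 2.5% and 97.5% points for 9 degrees of freedom) rather than computed.

```python
import math

def expected_s_over_sigma(n):
    """E(s)/sigma = sqrt(2) * Gamma(n/2) / Gamma((n-1)/2), with s = sqrt(sum (x_i - xbar)^2)."""
    return math.sqrt(2.0) * math.exp(math.lgamma(n / 2.0) - math.lgamma((n - 1) / 2.0))

def conventional_expected_length(n, chi2_lower, chi2_upper, sigma=1.0):
    """Equation (6): E(l) = E(s) * (1/a2 - 1/a1), a1^2 = upper and a2^2 = lower chi-square point."""
    a1 = math.sqrt(chi2_upper)
    a2 = math.sqrt(chi2_lower)
    return sigma * expected_s_over_sigma(n) * (1.0 / a2 - 1.0 / a1)

# n = 10, 95% interval: chi-square points with 9 d.f. taken from standard tables.
length_n10 = conventional_expected_length(10, 2.70039, 19.0228)
```

Dividing such a value by the corresponding $E(l_0)$ gives the ratio $E(l)/E(l_0)$.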


In Table I we also show the ratio $E(l)/E(l_0)$, which we call the “effectiveness”
of the proposed confidence interval. It may be observed that for small
samples the conventional method yields considerably shorter intervals, but as
n increases the effectiveness of the proposed intervals improves. This means
that, as the sample size increases, the expected length of the confidence intervals
obtained by our method decreases steadily, at a slightly better rate than that
exhibited by the corresponding intervals obtained by the conventional method.
However, the effectiveness of our confidence interval will always be below 1.
This is due to the inefficiency of the sample quasi-ranges.

We would like to point out that Table I may be looked upon as a companion
to Harter’s [5] tables of point estimates for $\sigma$ from the normal distribution,
also based on the use of quasi-ranges. We include Harter’s point estimates for
n = 10(5)100 in Table I for the sake of illustration. It may be noted that, at
least for the values appearing in Table I, the expected value of Harter’s point
estimate is always between the expected confidence bounds proposed by us.
This seems of interest if we recall that Harter’s point estimates were arrived
at from considerations based on the distribution of order statistics of the normal
cdf, while our confidence bounds were obtained by first applying distribution-free
methods and then imposing the distribution.


3. CONFIDENCE INTERVALS FOR σ OF NON-NORMAL DISTRIBUTIONS
3.1 THE EXPONENTIAL DISTRIBUTION

In Table II we present confidence bounds of the shortest interval for $\sigma$ of
the exponential distribution. The procedure by which the values in Table II

² For N(0, 1), $\sigma$ will be deleted from (5). This value of $E(l_0)$ is tabulated in Table I.

TABLE II. CONFIDENCE BOUNDS FOR σ FROM THE
EXPONENTIAL DISTRIBUTION








Confidence Coefficients 





Bounds— .995 
Intervals—.990 


Bounds— .990 


Sample Intervals-— .980 


size n 





Lower 
Confidence 
Bound 


Upper 
Confidence 
Bound 


Expected 
Length of 
Interval 


Effective- 


ness 
% 


Lower 
Confidence 
Bound 


Upper 
Confidence 
Bound 


Expected 
Length of 
Interval 


Effective- 
ness 


% 








. 2053032 
- 188811: 
.234124u; 
. 2219451, 
- 254231, 


- 244099, 
- 269172 
- 2604105 
- 2808925 
. 273097 us 


- 2904087 
- 28334207 
- 298337 ws 
-291897 ws 
-805101we 


-299171lws 
-311001wio 
- 305462010 
-3161390n 





5 .056552u1 
1.405758u1 
0.950685w: 
-765510wi 
-880459w2 


-761272ws 
-6826 162 
-740035ws 
-679711ws 
-6335630s 


-671200ws 
-632921u,s 
-601373 ws 
-629447 ws 
-601995ws 


.578463ws 
-600780we 
. 5797 26we 
.561171ws 





-952144 
- 159277 
-918687 
-404293 
2.063432 


-815884 
-647994 
- 520236 
-410072 
.325107 


- 252821 
-189718 
- 136358 
-089516 
-046855 


.010039 
-976057 
- 945319 
-917797 


15.70 
38 .87 
45.66 
48.14 
50.21 


2.09 
.14 
3.87 
-73 
-25 


-69 
15 
-46 


ff 


-04 











.221600w: 
-202505we 
. 248792ws 
- 2349853 
- 267778. 


. 25655204 
-281886ws 
- 272254; 
. 292921 ws 
.284432we 


.301861w7 
.294250w7 
.309341ws 
.302424ws 
.315700ws 


. 30935309 
-3212191w10 
-315337wi0 
.326078w1 





-788542u1 
- 1643381 
-838249u1 
-692689u1 
-800149w2 


. 7015582 
-634910w: 
-690023w; 
-637754us 
. 597224; 


-634006u4 
- 599889u%,s 
-571644w.4 
-599126ws 
. 57434405 


- 596827 we 
.574918we 
.555695ws 
. 57433307 





7.508018 
3.344429 
2.491345 
- 100755 
-800970 


-604119 
-467769 
.352863 
- 261672 
. 189816 


- 125362 
-071193 
-025601 
- 983026 
-946368 


-913774 
-883606 
-856626 





25.16 
42.37 
47.23 
48.87 
51.17 


57 
.29 
-12 
-75 
13 


-60 
04 
.14 
-46 
67 


-82 
-03 
-20 
31 
Confidence Coefficients





.975 
Intervals— .950 


Bounds—.950 
Intervals— .900 








Lower 
Confidence 
Bound 
0 .248118w2 
.224370w2 
.271935ws 
. 255488ws; 
. 2889081, 


.275818w4 
-301543 us 
-290516ws 
-311436we 
-301836we 


.319459w7 
-310916w7 
.303451w7 
.318446ws 
-311591ws 


.324815ws 
.318487ws 
-330320ui0 
-324406wi0 





Upper 
Confidence 
Bound 
1.672285 
0.925139u1 
-710898u1 
- 825281 we 
. 70398802 


-627592we 
-683827 ws 
-626844u. 
58385403 
.620958w, 


-585514w, 
- 556570, 
-584168ws 
.559073ws 
- 581624. 


- 559578we 
- 540389ws 
-559021w7 
-541730w7 





Expected 
Length of 
Interval 
4.304610 
2.518991 
1.994650 
1.696833 
1.476787 


1.333384 
1.225198 
1.134184 
1.064630 
1.004777 


-953566 
-910571 
-871858 
-837709 
-808276 


- 780369 
-755838 
- 733534 
-712726 


Effective- 
ness 

% 
34.84 
45.58 
48.29 
49.81 
51.59 

43 


o1 
-71 
-06 
41 
Lower
Bound
Bound


Expected 
Length of 
0.167292w:
- 244903 v2 
- 228213 w: 
-274373ws 
- 260665: 


-293379w.s 
- 296807 ws
-317493 we 


-308384ws 
-325850w7 
-317630w7 
.332750ws 
.325251ws 


. 338581» 
-331680ws 
-337193wi0





-238920u1 
- 7829351 
-62626401 
‘ 636798we


. 5743462 
- 580387 ws
-618050u, 
. 579728 


. 548971. 
.576849ws 
- 528398ws
- 5505 15ws 


. 53084316 
-549550w1 
-531974w7 
- 5163887 





3.031599 
2.011839 
1.652406 
1.402841 
1.240646 


1.130392 
1.038194 
0.966605 
- 910537 
-859674 


-818270 
-782160 
- 749752 
-721947 
-696265 


-673416 
-652905 
-633738 
-616370 
TABLE II.—Cont’d.








Confidence Coefficient 





Bounds— .900 ‘ . 
Sample Intervals—.800 Harter’s Point 


size n Estimates 





Lower Confidence | Upper Confidence | Expected Length Effectiveness 
Bound Bound of Interval % 





- 1897461 -950685u1 2.152673 41.97 -582121w: 
-270576w2 -662540u1 1.564399 45.46 -458687 ws 
-250279ws -776094w3 1.311965 -09 - 515583 ws 
-297484ws -648149u: 1.120498 -75 -551869ws 
-281377ws - 57284502 1.003916 31 -494783 4 


- 314639, -628271w; 0.917140 -71 -523107ws 
-801186ws -574289u -846345 . -544996ws 
-326861 ws -534349w, - 792938 36 -507560ws 
-315265ws - 570885. - 746026 64 -526316w71 
-33618lws - 538335w. - 707477 -93 -541974us 


-325958wes -511937w. -674688 
-817138ws -539146ws -645166 
-334451lw7 - 5163355 -619713 
-826393w1 - 538547 ws - 596987 
-34141lwus -518558ws - 576379 


03 -514089ws 
12 - 528091 we 
29 - 505457 we 
38 -518052wie 
49 -529217wu 


-833999ws -501192ws . 558237 
-347256we -519560w7 -541166 
-340388we .503924w7 . 525822 
-352259w10 -519852ws -511795 


-51 -5104500n 
59 - 520713 wis 
.64 - 52999513 
-613967wu 


SESS SESEE 




















were computed is the same as for the normal case. However, the computations
are much easier. For $F(x) = 1 - e^{-x}$ we have

$$c_0(r) = \log_e[(1 - p_0)/p_0],$$
$$c_0(u) = \log_e[(1 - p_0')/p_0'].$$

The expected value of a symmetrical quasi-range for a sample drawn from
the exponential distribution is given by

$$E(w_r) = \sum_{k=r}^{n-r} 1/k. \tag{7}$$

Therefore, we choose r such that
$E(w_r)/\log_e[(1 - p_0)/p_0]$ is minimized,
and choose u such that
$E(w_u)/\log_e[(1 - p_0')/p_0']$ is maximized.
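For the exponential case every quantity has a closed form except p, so the selection of r and u can be sketched in a few lines. As before this is an illustrative sketch: the names are hypothetical, $B_n(k, p)$ is assumed to be the cumulative binomial probability, p is solved by bisection, and only values of p below 1/2 are retained (an assumption made here).

```python
import math

def binom_cdf(k, n, p):
    """B_n(k, p) = P(X <= k) for X ~ Binomial(n, p) (assumed definition)."""
    return sum(math.comb(n, j) * p**j * (1.0 - p)**(n - j) for j in range(k + 1))

def solve_p(k, n, target, iters=200):
    """Solve B_n(k, p) = target for p by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if binom_cdf(k, n, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def c0_exponential(p0):
    """c0 = log_e[(1 - p0)/p0] for F(x) = 1 - exp(-x)."""
    return math.log((1.0 - p0) / p0)

def expected_quasi_range_exp(n, r):
    """Equation (7): E(w_r) = sum_{k=r}^{n-r} 1/k."""
    return sum(1.0 / k for k in range(r, n - r + 1))

def exponential_bounds(n, alpha):
    """Return r, u and the expected length (in units of sigma) of the interval."""
    upper, lower = [], []
    for k in range(1, n // 2 + 1):
        ew = expected_quasi_range_exp(n, k)
        p_r = solve_p(k - 1, n, alpha / 2.0)
        p_u = solve_p(k - 1, n, 1.0 - alpha / 2.0)
        if p_r < 0.5:                      # assumption: keep only p0 below 1/2
            upper.append((ew / c0_exponential(p_r), k))
        if p_u < 0.5:
            lower.append((ew / c0_exponential(p_u), k))
    expected_upper, r = min(upper)
    expected_lower, u = max(lower)
    return r, u, expected_upper - expected_lower
```

With n = 10 and $\alpha$ = .1 the sketch picks r = u = 1, and the constant $1/c_0(1)$ for the upper bound agrees with the .950685 that appears in Table II.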
3.1.1 COMPARISON WITH THE CONVENTIONAL METHOD

For the exponential distribution with mean and standard deviation each
equal to $1/\lambda$, the sample mean $\bar{x}$ is the efficient estimate of $1/\lambda$. Since $u = 2\lambda n\bar{x}$
has the $\chi^2$ distribution with 2n degrees of freedom, the conventional confidence
interval for $1/\lambda$ with confidence level $1 - 2\alpha$ is

$$(2n\bar{x}/\chi^2_{1-\alpha,\,2n},\; 2n\bar{x}/\chi^2_{\alpha,\,2n}).$$

The expected length $E(l)$ of the confidence intervals is, therefore,

$$E(l) = (2n/\lambda)\{1/\chi^2_{\alpha,\,2n} - 1/\chi^2_{1-\alpha,\,2n}\}. \tag{8}$$

The ratio $E(l)/E(l_0)$ is also presented in Table II, as is Harter’s point estimate.*

Also here, as in Table I, the expected value of Harter’s point estimate is
always between the expected confidence bounds shown in Table II. This is
easily tested by substituting $E(w_k)$ for $w_k$ in both bounds and the point estimate.
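3.2 THE RECTANGULAR DISTRIBUTION

A rectangular distribution with mean zero and variance one has the probability
density function

$$f(x) = \frac{1}{2\sqrt{3}}, \qquad -\sqrt{3} \le x \le \sqrt{3}.$$

Here the computations of the confidence bounds are straightforward:

$$c_0(r) = 2\sqrt{3}\,(1 - 2p_0),$$
$$c_0(u) = 2\sqrt{3}\,(1 - 2p_0').$$

Since the expected value $E(w_r)$ of a symmetrical quasi-range is

$$E(w_r) = \frac{n - 2r + 1}{n + 1}\, 2\sqrt{3}, \tag{9}$$

we should choose r such that

$$E(w_r)/c_0(r) = \left(\frac{n - 2r + 1}{n + 1}\right)\Big/(1 - 2p_0) \text{ is minimized,} \tag{10}$$

and choose u such that

$$E(w_u)/c_0(u) = \left(\frac{n - 2u + 1}{n + 1}\right)\Big/(1 - 2p_0') \text{ is maximized.} \tag{11}$$

Now, it can be shown that for large n, $L = E(w_k)/c_0(k)$ is an increasing function
of k if $\alpha < .5$, and a decreasing function of k if $\alpha > .5$: Write

$$\frac{\partial L}{\partial k} = \frac{\partial L}{\partial p}\Big/\frac{\partial k}{\partial p}.$$

For large n

$$(k - 1 - np)/\sqrt{np(1 - p)} = t_\alpha,$$

where

$$\Phi(t_\alpha) = \alpha, \qquad \Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,dt.$$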





* Point estimates of the population standard deviation for the exponential distribution are not included in 
Harter [5]. They appear in an earlier version of the same paper (see reference [4]). 
dk Vn (1 — 2p)


op 


Pet pie —____+__— [(1 — 2p) + 2p*] 20 
dp (1 — 2p)*V/np(1 — p) 


if $t_\alpha \lessgtr 0$, or if $\alpha \lessgtr .5$.


We therefore have for the rectangular distribution and large n that (10) is 
minimized for r=1 and (11) is maximized for u=1. Our calculations show that 
the same result holds for small values of n as well. 

A general expression for $E\{l_0\}$, the expected length of the confidence interval,
can now be easily derived. For r = u = 1, we have from equations (4) and
(4a) that $p_0 = 1 - (\alpha/2)^{1/n}$, $p_0' = 1 - (1 - \alpha/2)^{1/n}$, and from equations (10) and
(11) we obtain

$$E\{l_0\} = \frac{n-1}{n+1}\left\{[2(\alpha/2)^{1/n} - 1]^{-1} - [2(1 - \alpha/2)^{1/n} - 1]^{-1}\right\}.$$
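Because r = u = 1 in the rectangular case, the bounds and the expected length follow directly from the closed forms above. The short sketch below (function name hypothetical) evaluates them. The multipliers of the range $w_1$ are $1/c_0$, and the expected length is in units of $\sigma$.

```python
import math

def rectangular_bounds(n, alpha):
    """Lower and upper multipliers of the range w_1, and E{l_0}, for r = u = 1."""
    root = 2.0 * math.sqrt(3.0)
    # 1 - 2*p0 = 2*(alpha/2)^(1/n) - 1 and 1 - 2*p0' = 2*(1 - alpha/2)^(1/n) - 1
    upper_const = 1.0 / (root * (2.0 * (alpha / 2.0) ** (1.0 / n) - 1.0))
    lower_const = 1.0 / (root * (2.0 * (1.0 - alpha / 2.0) ** (1.0 / n) - 1.0))
    expected_range = root * (n - 1) / (n + 1)          # equation (9) with r = 1
    expected_length = expected_range * (upper_const - lower_const)
    return lower_const, upper_const, expected_length

lower, upper, length = rectangular_bounds(10, 0.005)
```

For n = 10 and $\alpha$ = .005 this gives multipliers of about 0.28882 and 2.92891, in agreement with the first row of Table III.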


Table III presents the confidence bounds, the expected lengths of the confidence
intervals and Harter’s point estimate for the standard deviation of the
rectangular distribution. The effectiveness ratio is omitted in this case, since
the determination of $E\{l\}$ would involve a considerable computational effort
outside the scope of this study. It should be mentioned that the bounds and
point estimates are based on the same statistic $w_1$ (the sample range) and that
the point estimates fall between the confidence bounds given in Table III. This
might indicate an effectiveness ratio close to 1 for medium and large sample
sizes.


4. DISCUSSION 


So far our main concern has been the presentation of the proposed method of 
using sample quasi-ranges for estimating the population standard deviation 
and its application to commonly used distributions. A few words should now 
be devoted to the evaluation of its advantages and limitations relative to the 
conventional method. Three evaluation criteria come to mind in this connec- 
tion: 


(a) if a sample of size n is taken, how does the expected length of the confidence
interval obtained by the proposed method compare to the one
obtained by the conventional method; this is the “effectiveness” criterion
introduced previously;
(b) if it is desired to obtain a confidence interval of a given expected length,
say d, how large should n be using the proposed method as compared to
the conventional method;





TABLE III. CONFIDENCE BOUNDS FOR σ FROM THE
RECTANGULAR DISTRIBUTION

                          Confidence Coefficients

           Bounds—.995, Intervals—.990          Bounds—.990, Intervals—.980

Sample     Lower        Upper        Expected   Lower        Upper        Expected
size n     Confidence   Confidence   Length of  Confidence   Confidence   Length of
           Bound        Bound        Interval   Bound        Bound        Interval

 10        0.288820w_1* 2.928910w_1* 7.482715   0.288965w_1* 1.627182w_1* 3.792862
 15        0.288772     0.845555     1.687659   0.288868     0.713053     1.285742
 20        0.288747     0.598577     0.971065   0.288820     0.540043     0.787380
 25        0.288733     0.503100     0.685467   0.288791     0.467083     0.570112
 30        0.288723     0.452520     0.530803   0.288772     0.426900     0.447619

 35        0.288716     0.421220     0.425557   0.288758     0.401472     0.368761
 40        0.288711     0.399947     0.364311   0.288747     0.383938     0.313666
 45        0.288707     0.384552     0.317581   0.288739     0.371118     0.272962
 50        0.288704     0.372896     0.280212   0.288733     0.361336     0.241641
 55        0.288701     0.363765     0.250743   0.288728     0.353629     0.216794

 60        0.288699     0.356419     0.226898   0.288723     0.347398     0.196592
 65        0.288697     0.350381     0.207205   0.288720     0.342258     0.179841
 70        0.288696     0.345331     0.190663   0.288716     0.337945     0.165731
 75        0.288694     0.341045     0.176577   0.288714     0.334275     0.153675
 80        0.288693     0.337361     0.164428   0.288711     0.331113     0.143258

 85        0.288692     0.334162     0.153850   0.288709     0.328361     0.134164
 90        0.288691     0.331357     0.144551   0.288707     0.325944     0.126158
 95        0.288690     0.328877     0.136312   0.288706     0.323805     0.119053
100        0.288690     0.326670     0.128961   0.288704     0.321898     0.112710

* w_1 (the sample range) is to be used with all of the constants in the table.


TABLE III.—Cont’d.

                          Confidence Coefficients

           Bounds—.975, Intervals—.950          Bounds—.950, Intervals—.900

Sample     Lower        Upper        Expected   Lower        Upper        Expected
size n     Confidence   Confidence   Length of  Confidence   Confidence   Length of
           Bound        Bound        Interval   Bound        Bound        Interval

 10        0.289403w_1* 0.994095w_1* 1.997284   0.290142w_1* 0.753710w_1* 1.313875
 15        0.289160     0.585150     0.897172   0.289652     0.511871     0.673566
 20        0.289039     0.475984     0.585921   0.289407     0.435320     0.457319
 25        0.288966     0.425497     0.436576   0.289261     0.397827     0.347154
 30        0.288917     0.396424     0.348388   0.289163     0.375589     0.280073

 35        0.288883     0.377531     0.290025   0.289093     0.360873     0.234839
 40        0.288857     0.364271     0.248498   0.289041     0.350417     0.202241
 45        0.288837     0.354453     0.217418   0.289000     0.342605     0.177620
 50        0.288820     0.346890     0.193272   0.288968     0.336547     0.158355
 55        0.288807     0.340887     0.173967   0.288941     0.331712     0.142872

 60        0.288796     0.336005     0.158175   0.288919     0.327764     0.130151
 65        0.288787     0.331958     0.145017   0.288900     0.324479     0.119514
 70        0.288779     0.328549     0.133887   0.288884     0.321704     0.110489
 75        0.288772     0.325637     0.124343   0.288870     0.319327     0.102730
 80        0.288766     0.323121     0.116071   0.288858     0.317270     0.095992

 85        0.288761     0.320926     0.108832   0.288847     0.315471     0.090083
 90        0.288756     0.318994     0.102445   0.288838     0.313886     0.084862
 95        0.288752     0.317280     0.096765   0.288829     0.312477     0.080212
100        0.288748     0.315750     0.091685   0.288821     0.311218     0.076049

* w_1 (the sample range) is to be used with all of the constants in the table.

TABLE III.—Cont’d.

                          Confidence Coefficients

           Bounds—.900, Intervals—.800

Sample     Lower        Upper        Expected   Harter’s
size n     Confidence   Confidence   Length of  Point
           Bound        Bound        Interval   Estimate

 10        0.291660w_1* 0.598577w_1* 0.869884   0.352825w_1*
 15        0.290660     0.452520     0.490612   0.329914
 20        0.290162     0.399947     0.344087   0.319062
 25        0.289863     0.372896     0.265509   0.312731
 30        0.289665     0.356419     0.216324   0.308584

 35        0.289523     0.345331     0.182584   0.305656
 40        0.289417     0.337361     0.157981   0.303479
 45        0.289334     0.331357     0.139243   0.301797
 50        0.289268     0.326670     0.124483   0.300458
 55        0.289214     0.322911     0.112561   0.299367

 60        0.289169     0.319828     0.102724   0.298461
 65        0.289131     0.317255     0.094472   0.297696
 70        0.289099     0.315074     0.087445   0.297043
 75        0.289070     0.313203     0.081399   0.296477
 80        0.289046     0.311579     0.076129   0.295983

 85        0.289024     0.310157     0.071504   0.295548
 90        0.289004     0.308901     0.067410   0.295162
 95        0.288987     0.307784     0.063758   0.294817
100        0.288972     0.306783     0.060477   0.294507

* w_1 (the sample range) is to be used with all of the constants in the table.


(c) for a given n, how does the time required to compute the confidence 
interval by our method compare to the time required by the conven- 
tional method? 


The answer to the first question is readily obtained from our tables for the 
normal and exponential distributions. For sample sizes between 10 and 100 our 
intervals are roughly 3 to 2 times longer than the ones obtained by the conven- 
tional approach. The same ratios apply to the second question: our samples 
have to be roughly 3 to 2 times larger to yield a confidence interval of a given 
expected size. The principal advantage of our method is in terms of computa- 
tion time. With the aid of the tables, all that is required is the ordering of a few 
lowest and a few highest sample values and calculating two differences and two 
products. 


5. SOME REMARKS ABOUT COMPUTATIONS 
5.1 PERCENTILES OF THE BINOMIAL DISTRIBUTION 
The solution for p of equations (4) and (4a) of the form

$$B_n(k, p) = \alpha$$

has been obtained by Leone, Haynam, Chu and Topp [6], who repeatedly applied
Newton’s method to the equation

$$B_n(k, p) - \alpha = 0.$$

The values of p are tabulated in ref. [6] for the following values of $\alpha$, n and k:

$\alpha$: .0025, .0050, .0100, .0250, .0500 and .1000, as well as their one’s complements;
n = 10(5)100; $0 \le k \le [n/2]$.

These p-values are accurate to 7 decimal places. For values of n (n < 119) not
in ref. [6] one can refer to tables of the percentiles of the Beta Distribution [7].
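A Newton iteration for $B_n(k, p) - \alpha = 0$ can be sketched as follows. This is not the program of ref. [6]: the names are hypothetical, $B_n(k, p)$ is assumed to be the cumulative binomial probability, and a bisection fallback is added here as a safeguard whenever a Newton step leaves the current bracket. The derivative used is the standard identity $\partial B_n(k,p)/\partial p = -n\binom{n-1}{k}p^k(1-p)^{n-k-1}$.

```python
import math

def binom_cdf(k, n, p):
    """B_n(k, p) = P(X <= k) for X ~ Binomial(n, p) (assumed definition)."""
    return sum(math.comb(n, j) * p**j * (1.0 - p)**(n - j) for j in range(k + 1))

def binom_cdf_dp(k, n, p):
    """Derivative of B_n(k, p) with respect to p, valid for k < n."""
    return -n * math.comb(n - 1, k) * p**k * (1.0 - p)**(n - k - 1)

def solve_p_newton(k, n, alpha, tol=1e-12, max_iter=100):
    """Solve B_n(k, p) = alpha for p by Newton's method with a bisection safeguard."""
    lo, hi = 0.0, 1.0
    p = (k + 1.0) / (n + 1.0)            # starting guess (an arbitrary choice)
    for _ in range(max_iter):
        f = binom_cdf(k, n, p) - alpha
        if f > 0.0:
            lo = p                       # B_n decreases in p, so the root is to the right
        else:
            hi = p
        slope = binom_cdf_dp(k, n, p)
        p_new = p - f / slope if slope != 0.0 else 0.5 * (lo + hi)
        if not (lo < p_new < hi):
            p_new = 0.5 * (lo + hi)      # fall back to bisection
        if abs(p_new - p) < tol:
            return p_new
        p = p_new
    return p
```

For k = 0 the result can be checked against the closed form $p = 1 - \alpha^{1/n}$.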


5.2 EXPECTED VALUES OF QUASI-RANGES FROM THE NORMAL DISTRIBUTION 


Our method requires the calculation of expected values of quasi-ranges,
namely values of $E(w_{n,r})$ and $E(w_{n,u})$. Simple expressions can be obtained
for $E(w_{n,k})$ from the exponential and rectangular distributions and are given
in (7) and (9). This is not the case with the normal distribution, where


n co) 
Btwn) = H() f“lr@} bt - Pe -yerae 
has to be evaluated numerically. 

Fortunately, partial tables of $E(w_{n,k})$ and the related quantity $E(x_{n,k}) = \tfrac{1}{2}E(w_{n,k})$
from the normal distribution have been published. Expected values
of the range, namely $E(w_{n,1})$ for n = 2(1)1000, have been calculated by Tippett
[9] to five decimal places. Cadwell [1] calculated $E(w_{n,2})$ to four decimal places
for n = 10(1)30. Teichroew [8] has tabulated $E(x_{n,k})$ accurate to ten decimal
places for n = 2(1)20, k = 1(1)[n/2]. Harter [5] has recently published extensive
tables of $E(w_{n,k})$, accurate to within 1 in the sixth decimal place for
n = 2(1)100, k = 1(1)s, s = min([n/2], 8).

In order to compute our Table I, we have extended Harter’s tabulation to
include values of $E(w_{n,k})$ for n = 25(5)100, k = 9(1)[n/2], using essentially
Harter’s method of numerical integration and obtaining results accurate to
five decimal places. Table I is based on the following values of $E(w_{n,k})$:








n             k              Source           Accuracy

10, 15, 20    1(1)[n/2]      Teichroew [8]    10 decimal places
25(5)100      1(1)8          Harter [5]       within 1 in the 6th decimal place
25(5)100      9(1)[n/2]      Leone, et al.    5 decimal places











APPENDIX
TABLES OF CONFIDENCE BOUNDS
In this Appendix we present
Table I: Confidence bounds for σ from the normal distribution.
Table II: Confidence bounds for σ from the exponential distribution.
Table III: Confidence bounds for σ from the rectangular distribution.
Some simple examples of the use of these tables follow:

Example 1:

Find the 95% upper confidence bound (UCB) for $\sigma$ from the normal distribution
using a sample of size 25.

We have from Table I, for $1 - \alpha = .95$ and n = 25, that

UCB = $.457416\,w_1 = .457416\,(x_{25} - x_1)$.

Example 2:

Find the 95% confidence interval (CI) for $\sigma$ from the normal distribution
using a sample of size 30.

We have $1 - 2\alpha = .95$ and $1 - \alpha = .975$. From Table I, for $1 - \alpha = .975$ and
n = 30:

CI = UCB − LCB = $.454971\,w_1 - .233718\,w_3$
   = $.454971\,(x_{30} - x_1) - .233718\,(x_{28} - x_3)$.

Example 3:

Find the 90% lower confidence bound (LCB) for $\sigma$ from the rectangular
distribution using a sample of size 50.

We have from Table III, for $1 - \alpha = .9$ and n = 50, that LCB = $.289268\,w_1$.
(All confidence bounds for the rectangular distribution are in terms of
the range $w_1$.)
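The examples above amount to sorting the sample, forming one or two quasi-ranges and multiplying by tabulated constants. A minimal sketch follows. The function names and the data are hypothetical, and the constants are those quoted in Example 2.

```python
def quasi_range(sample, k):
    """w_k = x_(n-k+1) - x_(k), the k-th symmetric quasi-range (k = 1 is the range)."""
    xs = sorted(sample)
    return xs[len(xs) - k] - xs[k - 1]

def confidence_interval(sample, lower_const, u, upper_const, r):
    """Return (LCB, UCB) = (lower_const * w_u, upper_const * w_r)."""
    return lower_const * quasi_range(sample, u), upper_const * quasi_range(sample, r)

# Example 2 constants (Table I, n = 30, 1 - alpha = .975): LCB = .233718 w_3, UCB = .454971 w_1.
sample = [((37 * i) % 30) / 10.0 for i in range(30)]   # hypothetical data, not from the paper
lcb, ucb = confidence_interval(sample, 0.233718, 3, 0.454971, 1)
```

Only the k smallest and k largest observations are needed, which is the computational advantage discussed in Section 4.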
THE STATISTICAL WORK OF OSKAR ANDERSON¹


GERHARD TINTNER
Iowa State University


The death of Professor Oskar Anderson in Munich, Germany on February
12, 1960 in his 73rd year is a great loss for statisticians everywhere. Professor
Anderson was born on August 2, 1887 in Minsk, Russia. He studied
mathematics, physics, economics, and law at the Universities of Kasan and St.
Petersburg (Leningrad). He was an assistant of the well known Russian
statistician A. A. Tschuprow. He taught mathematical statistics at the High
School of Commerce in Kasan, later at the High School of Commerce in Varna
(Bulgaria), the University of Kiel, and the University of Munich. Apart from
his teaching activities he participated actively in sampling censuses in Russia
and Bulgaria.

He was perhaps the most widely known statistician in Central Europe. Dur- 
ing his very distinguished career, he provided a link between the Russian 
school of statistics, from which he originated (Markoff, Tschuprow) and the 
Anglo-American school. He was an honorary doctor of the University of 
Vienna, also of the High School of Economics, Mannheim (Germany), an
honorary member of the Royal Statistical Society, and of the German Statisti- 
cal Society, fellow and founder of the Econometric Society, member of the 
International Statistical Institute, fellow of the American Statistical Society, 
the Institute of Mathematical Statistics, and the American Society for the 
Advancement of Science. 

Through his origin in the flourishing Russian school of “probabilistes,” who 
are certainly today the most important contributors in this field (Kolmogoroff, 
Khintchin, Gnedenko, etc.), Anderson belongs to the so-called “continental” 
school of statistics, and worked in the tradition of the well known German 
statisticians Lexis and von Bortkiewicz. He might be the last representative
of this approach, since recent textbooks in statistics published in German 
show already the overwhelming influence of the Anglo-American school of Sir 
Ronald Fisher and Jerzy Neyman. 

It is, of course, impossible to isolate “schools” of statistics in this manner, as 
actually the members of the various tendencies in statistics were in contact 
with each other. Anderson himself appreciated the achievements of the Anglo- 
American school to a certain extent, and was also somewhat critical about the 
applicability of these methods in the field of the social sciences. 

We propose to discuss the works of Anderson under nine headings: (1) biographical
and bibliographical, (2) probability, (3) survey sampling, (4) Variate
Difference Method, (5) time series analysis, (6) econometrics, (7) index numbers,
(8) correlation methods, (9) textbooks.

1 I am much obliged to the son of Professor Oskar Anderson, Professor Oskar Anderson, Jr. of Handelshoch- 
schule, Mannheim, Germany, for supplying the bibliography. I have also profited from reading the obituaries by 
S. Sagoroff (“Nachruf fuer Oskar Anderson,” Metrika, 3 (1960), pp. 89 ff.) and H. Strecker (“Im Gedenken an 


Oskar Anderson,” Schweizerische Zeitschrift fuer Volkswirtschaft und Statistik, 96 (1960), pp. 238 ff.). Journal Paper 
No. J-4004 of the Iowa Agricultural and Home Economics Experiment Station, Ames, Iowa, Project No. 1200. 
1. BIOGRAPHICAL AND BIBLIOGRAPHICAL


This list includes an account of Anderson’s teacher Tschuprow [1926]² and
of the work of L. von Bortkiewicz [1931c], which is remarkable for the treatment
of the contributions of this outstanding scholar to statistics and economics.
The bibliography [1955a] is very complete and indispensable for the
study of German statistics after the last war. It is to be hoped that this bibliography
will be continued.


2. PROBABILITY 


The point of view taken by Anderson about the interpretation of probability
is most original and worth noting ([1930], [1931b], [1947], [1948], [1949a],
[1949c], [1951b], [1957b], [1958]). The origin of his idea can, of course, be
traced to the Russian school of probability. Very remarkable is the definition
of a “social-statistical probability” ([1957c] p. 100): “Probability of an attribute
or a characteristic of a social statistical population (Gesamtheit) is the
frequency in a population of higher order, out of which the present population
has been taken. It is necessary to be precise about the way in which the given
population has been derived from the higher population, as much as practical
applications of probability theory are concerned. The population of higher
order could be finite or infinite, depending on the concrete circumstances.”

In statistical inference the point of view of Anderson is somewhat different 
from the accepted statistical text books. He bases himself essentially upon cer- 
tain ideas of the great French mathematician and economist A. Cournot in 
connection with the law of large numbers: The connection between the purely 
mathematical theorems, like the law of large numbers, and practical statistical 
applications is established by the “Cournot bridge.” This consists of three 
parts: 

1. Events whose probability is very small happen very infrequently. This is
a purely empirical proposition.

2. Consider the deviation of relative frequency from the corresponding prob- 
ability, and in general the probability of deviations of certain characteristics 
of the sample (e.g. the arithmetic mean) from the corresponding expectation 
values. The probability that such a deviation will be greater than a given mag- 
nitude, which is fixed in advance, will be the smaller, the larger the number of 
observations. 

From these two lemmas follows the theorem: 

3. If only the number of observations is sufficiently large, the deviations can 
be expected to be frequently very small ([1957c] p. 106). 

These considerations provide a justification for the application of Bernoulli’s
theorem, Tchebycheff’s inequality, etc. in applied statistics.


3. SAMPLE SURVEYS 


Professor Anderson must be counted among the pioneers of modern sample 
surveys ([1928a], [1928b], [1929a], [1929d], [1949/50a]). He participated in 
Russia in 1913-17 in a representative sampling survey of agriculture in Russian 





² Square brackets refer to the Bibliography of Oskar Anderson.
Turkestan. He also was very influential in the preparation of the Bulgarian
agricultural sample census of 1926. In the same country he started in 1936 a 
yearly sample census of agricultural acreage and production. In his many 
theoretical contributions to the subject he stresses the point of view that the 
sample census must be based upon a probability model. The level of tolerance 
and the desired accuracy of results should be fixed in advance. 


4. THE VARIATE DIFFERENCE METHOD 


The work of Professor Anderson on the Variate Difference Method is perhaps
best known in America. His own work follows some articles by Poynting,
Hooker, Cave, and March, and is contemporary with the work of “Student”
(W. S. Gosset) ([1914], [1923], [1925], [1927a]). His main contribution is the
German monograph [1929c]. An account of the earlier history and the contributions
of Anderson may be found in my own monograph.³

The Variate Difference Method is an attempt to deal with time series while 
making a minimum of assumptions. We assume that the series consists of a 
“smooth” systematic part (trend and long cycles, business cycles) and super- 
imposed independent random errors. These errors are not autocorrelated. Then 
we can completely eliminate the smooth part of the series by taking finite
differences if the smooth part is a polynomial; if it is a “well behaved” function
of time we can at least reduce the systematic component indefinitely by com- 
puting differences. After taking enough differences we are left with the random 
component alone, or at least with a series which contains only insignificant re- 
mainders of the systematic part of the time series. 

The problem is how many differences we ought to take. If our task is done 
in the k-th difference series, then this series and the series of all higher differ- 
ences will contain only the random part. Anderson worked out formulae which 
permit the comparison of the variances of two consecutive difference series. 
His results were later improved by Zaycoff, one of his Bulgarian students. I
have myself proposed a somewhat inefficient method in this field, based upon
the assumption of normality of the error component and the principle of
selection. This procedure utilizes only a part of the available differences at a time.
Later I succeeded in finding the exact small sample distribution of the variances
of Variate Differences, if the errors are normally distributed and we deal with a
circular universe.⁴ This work is being continued by Dr. J. N. K. Rao and me.
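
The comparison of consecutive difference series can be illustrated with a minimal Python sketch. It is not Anderson's or Zaycoff's own procedure: it only assumes the standard result that, for independent errors with variance σ², the k-th differences have variance C(2k, k)σ², so the mean square of the k-th differences divided by C(2k, k) should settle down once the smooth part has been differenced away. The function names and the simulated series are hypothetical.

```python
import random
from math import comb


def difference(series, k):
    """Return the k-th finite difference series of a list of numbers."""
    out = list(series)
    for _ in range(k):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def variate_difference_variances(series, max_k):
    """Estimate the error variance from the difference series of order 1..max_k.

    Assumption: for independent errors the k-th differences have variance
    C(2k, k) * sigma^2, so each mean square is divided by C(2k, k).
    """
    estimates = {}
    for k in range(1, max_k + 1):
        d = difference(series, k)
        estimates[k] = sum(v * v for v in d) / (len(d) * comb(2 * k, k))
    return estimates


if __name__ == "__main__":
    # Hypothetical series: quadratic trend plus independent normal errors.
    rng = random.Random(0)
    data = [0.05 * t * t + rng.gauss(0.0, 1.0) for t in range(200)]
    for k, v in variate_difference_variances(data, 5).items():
        print(k, round(v, 3))
```

For an exact polynomial of degree p the differences of order p+1 vanish identically, which is the sense in which differencing eliminates a polynomial trend completely.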

Criticism of the Variate Difference Method was offered by Bowley, Persons,
Fisher, Bartlett, M. G. Kendall, and Wald. They pointed out that higher
differences are not likely to be very accurate, that the existence of autocorrelation
makes the method inapplicable, as does the appearance of short periodic
fluctuations (seasonal movement). No doubt this criticism is sometimes justified.
Nevertheless, in spite of the great progress made in the field of statistical
analysis of stationary time series the Variate Difference Method still offers a
possible treatment of evolutionary statistical series, i.e. series which contain a
trend. Since this is always the case with empirical economic time series, the
method has not yet lost practical interest for econometricians.

³ G. Tintner: The Variate Difference Method. Bloomington, Indiana, 1940. Econometrics, New York, 1952, pp.
308 ff.
⁴ G. Tintner: “The distribution of the variances of variate differences in the circular case.” Metron, 17 (1955),
pp. 3 ff.

In practical econometric work the idea of working with first differences,
which is very popular in applied econometrics, may be considered as an
adaptation of the Variate Difference Method.⁵ It has been the practical experience of
most workers in the field that for short yearly series used in econometric work 
the linear trend contributes most of the systematic variation. Hence the use of 
first differences which eliminates a linear trend (or an exponential trend, if the 
data are logarithms) will sometimes greatly reduce the autocorrelation of the 
original time series. 

Apart from these applications, the Variate Difference Method did not
become very popular among practicing econometricians. But work in this field,
essentially based upon the fundamental ideas of Anderson, is still continuing.⁶


5. TIME SERIES ANALYSIS 


This work of Anderson’s is closely related to his work on the Variate
Difference Method ([1927b], [1927c], [1929c]). Most outstanding among his papers
is his devastating criticism of the Harvard method of analysis of economic time
series ([1929b]). This publication contributed much to the replacement of
these methods by more efficient ones.


6. ECONOMETRICS 


Anderson was among the founders of the Econometric Society and also a
fellow of this society. He must be counted among the most important contributors
to this science during the early period ([1931a], [1935a], [1935b], [1935d],
[1936a], [1936b], [1938a], [1939a], [1939b], [1941a], [1942b], [1945],
[1949b], [1951a]). Among his contributions we mention only: An effort to
verify statistically the quantity theory of money ([1931a]), which is of
special interest now because of the resurrection of the quantity theory by the
Chicago school of economists.⁷ His work on the “scissor” problem, i.e. the
divergent movement of agricultural and industrial prices ([1935b], [1935d],
[1941a]), should be of special interest for agricultural economists. His
publications on Bulgarian economics are very important sources for the economic
history of this country during the period between the two world wars ([1933],
[1938a], [1939a], [1939b]). His interesting review of the famous book of von
Neumann and Morgenstern on game theory ([1949b]) has contributed much
to acquainting the German speaking world with these new ideas.


7. INDEX NUMBERS 


Professor Anderson’s work includes many contributions to the theory of
index numbers ([1936c], [1937], [1938b], [1941b], [1949f], [1949/50b],
[1950b], [1950c], [1952]). He was most interested in the practical construction
of index numbers of production and cost of living index numbers. The problem
of chain index numbers also was treated by him ([1952]).

⁵ G. Tintner, Econometrics, op. cit. pp. 325 ff.

⁶ See e.g. A. R. Kamat: Distribution theory of two estimates for standard deviation based on second variate
differences. Biometrika, 41 (1954), pp. 1 ff. Contributions to the theory of statistics based upon first and second
successive differences, Metron, 19 (1958), pp. 97 ff. P. G. Moore: The properties of the mean square successive
difference in samples of various populations. Journal of the American Statistical Association, 50 (1955), pp. 344 ff.
A. P. Moore and F. E. Grubbs: The estimation of dispersion from successive differences, Annals of Mathematical
Statistics, 18 (1957), pp. 194 ff. J. N. K. Rao: A note on mean square successive differences. Journal of the American
Statistical Association, 54 (1959), pp. 801 ff.

⁷ M. Friedman. A program for monetary stability. New York, 1959. M. Friedman ed. Studies in the quantity
theory of money, Chicago, 1956.

On the whole he was sceptical of the ideas which are now generally treated
under the heading of the aggregation problem and also to Wald’s ideas on cost 
of living index numbers. 


8. CORRELATION 


His work on statistics included some remarkable contributions on the
general problem of correlation, which are always much influenced by the ideas of
Tschuprow in this field ([1914], [1927a], [1929c], [1931b], [1953a], [1955b],
[1955c], [1956b]).

Anderson feels very strongly that the usual theory of correlation and
regression, as presented in the statistical textbooks, is not applicable in the social
sciences because the underlying populations are not normal. Hence he has
developed non-parametric (distribution-free) methods for tests of significance of
correlation coefficients and autocorrelation coefficients ([1955b], [1955e],
[1956b]). This approach certainly deserves the attention of theoretical
statisticians and econometricians.


9. TEXTBOOKS


Not the least of Professor Anderson’s contributions to theoretical statistics are
his textbooks ([1932], [1935c], [1957c]). The reader can find in them a very
clear presentation of the fundamental ideas of statistics with a minimum of
mathematics. The German textbooks have had a great influence in Germany
and the German speaking countries. It would certainly be most desirable if the
last edition of his latest textbook ([1957c]) could be translated into English.
It is an excellent presentation of statistical methodology for the use of workers
in the social sciences. Because some of Anderson’s ideas about statistics differ
from those of the prevailing Anglo-American school, and this difference is very well
brought out in this text, a translation might stimulate discussion on these
problems.
A PROBLEM CONCERNED WITH WEIGHTING
OF DISTRIBUTIONS


COLERIDGE A. WILKINS
The University of New South Wales


The problem discussed arises in situations where one wishes to choose
weights which minimise the deviation of the weighted sum of n
distributions on the interval (0, L) from their lineal average. It is
assumed the distributions differ only in their means which are evenly
spaced over (0, L), that nR is not less than L where R is the length of
the range of each distribution and that each point in (0, L) is covered
by at most two of the ranges. A general solution is found from which it
follows that the minimising weights are positive and obey certain
inequalities. A table of results is given for the case where each distribution
is a truncated normal distribution. The general problem is introduced
by considering the practical question of how to obtain an even linear
spread of spray.


1. THE PROBLEM 


Let AB be a continuous line of objects which is to be sprayed by machines
placed at the points $S_i$, $i = 1, \cdots, n$. For instance, the “machines” could
be point sources of particles and AB a thin strip of material to be irradiated,
or they could be agricultural spraying machines and AB a row of thickly
growing plants.

The spraying machines, point sources, etc., will be called the components,
and the fluid or particles discharged will be called the fluid.

The operator will have a definite amount, M, of fluid to be spread along AB,
so if L is the length of AB, the average density will be M/L. It may be safely
assumed that the operator would like the actual density curve along AB to
approximate

$$y = \frac{M}{L}.$$

(If AB were a row of plants, and the actual density curve had large deviations
from this ideal curve, then the resulting maturation of the crop could be
excessively uneven.)

Although the symbols $S_i$, above, denote positions of different components,
they may also denote different positions of the same component, should the
operator decide to use only one machine for the whole task.

Take the origin at A and the x-axis along AB. The rectangular cartesian
coordinates of $S_i$ will be denoted by $(x_i, y_i)$.

The relative density curve of the fluid deposited on AB by any given
component will have the form of a probability distribution with a finite range. This
distribution will generally be symmetric about the corresponding $x_i$, which
will then be the mean of the distribution. It will be assumed that only one
type of component is being used, so the only parameter of the distribution that
will vary from component to component will be the mean. (In practice, only
one type of component is used; however, if it is necessary to have different
types, the technique discussed below could still be used by solving the relevant
equations numerically.)

We assume further that the operator wishes n, the number of components,
to be a minimum, for instance to save the cost of machines, or in the case
where one machine is used in n different positions, to minimise the labour of
shifting the machine. Each component will then be drawn back from AB till
it is spraying a maximum range, R, on AB. Then we will have

$$y_i = y_1, \qquad i = 1, \cdots, n,$$

and n will be given by

$$n = \begin{cases} [L/R] + 1, & L/R \neq [L/R], \\ [L/R], & L/R = [L/R], \end{cases}$$

where [L/R] is the integral part of L/R.

To avoid wastage of the fluid, $x_1$ and $x_n$ must have the values

$$x_1 = \frac{R}{2}, \qquad x_n = L - \frac{R}{2};$$

otherwise some of the fluid will fall outside AB. The remaining (n−2)
components will be arranged evenly over (R/2, L−R/2) so that $x_i$, $i = 1, \cdots, n$,
will be given by

$$x_i = \frac{R}{2} + (i - 1)\,\frac{L - R}{n - 1}.$$


Let $\phi_i(x)$ denote the distribution function of the fluid from the i-th
component. From the assumptions concerning range and symmetry,

$$\phi_i(x) = \phi_i(2x_i - x)$$

and

$$\phi_i(x) = 0, \qquad x < x_i - \frac{R}{2}, \quad x > x_i + \frac{R}{2}.$$

If D(x) is the actual density after spraying is completed, then

$$D(x) = \sum_{1}^{n} \phi_i(x)\, m_i$$

where $m_i$ is the total amount of fluid deposited along AB by the component
$S_i$. The operator would like y = D(x) to approximate y = M/L when 0 < x < L.
If a least-square measure for the deviation is adopted, the problem is to
minimise

$$S = \int_0^L \left( D(x) - \frac{M}{L} \right)^2 dx$$

with regard to the $m_i$, subject to the condition that

$$\sum_{1}^{n} m_i = M.$$

The problem is then essentially a weighting one.

It should be pointed out that if the components are agricultural sprays, etc.,
then the easiest way to make the spread even may be by making the time
that the i-th spray runs proportional to the value of $m_i$ which will be
obtained below. This will apply where the sprays cannot be regulated to deliver
fluid at different rates.

It should also be noted that M may be taken to be a density rather than an
absolute total. For instance, suppose AB to be the line of intersection of a
vertical plane and a horizontal rectangular field with 2 sides normal to the
plane (see Figure 1). Suppose further that the field is being aerially top-dressed
by a plane flying at a constant height along the path $\cdots XYZW \cdots$
which is such that when the plane is discharging over the field (e.g. along
XY), its direction of flight is normal to the vertical plane. The rate of discharge
on any one trip, such as XY or ZW, from one end of the field to the other, is to
be assumed constant for the duration of that trip. If the length of the field is
measured perpendicularly to AB and M now represents the density of fluid
per unit of the length, then from the assumption in the last sentence, M will
certainly be constant. The $S_i$ will represent the points at which the path of
flight intersects the vertical plane, while the rate of discharge, $r_i$, for the trip
determined by $S_i$ will be connected to $m_i$ by

$$m_i = r_i / V,$$

where V is the constant speed of the aeroplane. (The $m_i$ are also densities per
unit length. In practice, it might again be preferable to take the rate of
discharge as constant and instead alter the time factor 1/V.) To sum up then,
the field of application of this technique can be widened by suitably changing
the meanings of M and the $m_i$.

Fig. 1


2. THE SOLUTION 


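Using Lagrange’s method, the equations to be solved are

$$\int_0^L \phi_j(x) \left( \sum_{1}^{n} m_i \phi_i(x) - \frac{M}{L} \right) dx + \lambda = 0, \qquad j = 1, \cdots, n,$$

$$\sum_{1}^{n} m_i = M,$$

where $\lambda$ is the Lagrange multiplier.
The first n equations reduce to

$$\sum_{i=1}^{n} m_i \int_0^L \phi_i(x)\phi_j(x)\,dx + \lambda = \frac{M}{L} \int_0^L \phi_j(x)\,dx = \frac{M}{L}, \qquad j = 1, \cdots, n.$$

However, from our assumptions on n and R,

$$\phi_i(x)\phi_j(x) = 0 \quad \text{when} \quad |i - j| > 1.$$

Further, our assumptions give

$$\int_0^L (\phi_i(x))^2\,dx = \int_0^L (\phi_1(x))^2\,dx, \qquad i = 1, \cdots, n,$$

and

$$\int_0^L \phi_i(x)\phi_{i+1}(x)\,dx = \int_0^L \phi_1(x)\phi_2(x)\,dx, \qquad i = 1, \cdots, n - 1.$$

These results are illustrated in Figure 2 below.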
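If we now write

$$a = \int_0^L (\phi_i(x))^2\,dx, \qquad b = \int_0^L \phi_i(x)\phi_{i+1}(x)\,dx, \qquad \mu = \frac{M}{L} - \lambda,$$

the equations become

$$\begin{aligned}
a m_1 + b m_2 &= \mu, \\
b m_1 + a m_2 + b m_3 &= \mu, \\
b m_2 + a m_3 + b m_4 &= \mu, \\
&\;\;\vdots \\
b m_{n-2} + a m_{n-1} + b m_n &= \mu, \\
b m_{n-1} + a m_n &= \mu, \\
m_1 + m_2 + \cdots + m_{n-1} + m_n &= M.
\end{aligned}$$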


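Before these equations are solved, it is convenient to prove an inequality
between a and b. We have

$$0 < \int_{x_i}^{x_{i+1}} (\phi_i(x) - \phi_{i+1}(x))^2\,dx$$

$$= \int_{x_i}^{x_i + (R/2)} (\phi_i(x))^2\,dx + \int_{x_{i+1} - (R/2)}^{x_{i+1}} (\phi_{i+1}(x))^2\,dx - 2 \int_{x_{i+1} - (R/2)}^{x_i + (R/2)} \phi_i(x)\phi_{i+1}(x)\,dx.$$

Hence

$$0 < \frac{a}{2} + \frac{a}{2} - 2b, \quad \text{i.e.} \quad a > 2b.$$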
Now except for the first and the n-th, the equations connecting $m_{i-1}$, $m_i$,
$m_{i+1}$ and $\mu$ are of the form

$$b m_{i-1} + a m_i + b m_{i+1} = \mu,$$

which is a linear difference equation. The general solution for the $m_i$ in terms
of $\mu$ is therefore

$$m_i = A\alpha^i + B\beta^i + C$$

where A and B are constants still to be determined, $\alpha$ and $\beta$ are the roots of the
auxiliary equation

$$b x^2 + a x + b = 0$$

and

$$C = \mu/(a + 2b).$$

1 Hildebrand, F. B., Introduction to Numerical Analysis. New York: McGraw-Hill Book Company, 1956, p. 203.


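From the inequality $a > 2b$, it follows that $\alpha$ and $\beta$ are real. To determine A
and B in terms of $\mu$, we have the first and n-th equations which give

$$A(a\alpha + b\alpha^2) + B(a\beta + b\beta^2) = \frac{b\mu}{a + 2b},$$

$$A(a\alpha^n + b\alpha^{n-1}) + B(a\beta^n + b\beta^{n-1}) = \frac{b\mu}{a + 2b}.$$

Noting that $\alpha\beta = 1$, we see that

$$B = \alpha^{n+1} A$$

and

$$A = \frac{-\mu}{a + 2b} \cdot \frac{1}{1 + \alpha^{n+1}}.$$

Hence for all i

$$m_i = \frac{\mu}{a + 2b} \cdot \frac{(1 - \alpha^i)(1 - \alpha^{n+1-i})}{1 + \alpha^{n+1}} = \gamma (1 - \alpha^i)(1 - \alpha^{n+1-i}), \text{ say.}$$

Since $\alpha\beta = 1$, we may choose $\alpha$ to be the root such that $|\alpha| < 1$. Then the $m_i$
must be all of the same sign as $\gamma$. Also

$$M = \gamma \sum_{1}^{n} (1 - \alpha^i)(1 - \alpha^{n+1-i})$$

so $\gamma$ is positive. Finally, eliminating $\gamma$, we find

$$m_i = \frac{M(1 - \alpha)(1 - \alpha^i)(1 - \alpha^{n+1-i})}{n(1 + \alpha^{n+1})(1 - \alpha) - 2\alpha(1 - \alpha^n)}.$$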
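
As a numerical check on the closed form, the following Python sketch evaluates the weights from given values of a and b and verifies that they satisfy the tridiagonal equations. It is a sketch under stated assumptions, not the program used for the tables: the function names are hypothetical, the normalising constant is obtained by summing the factors rather than from the closed-form denominator, and the end equations are written with $m_0 = m_{n+1} = 0$, which the closed form satisfies.

```python
from math import sqrt


def minimising_weights(a, b, total, n):
    """Closed-form weights m_1..m_n for coefficients with a > 2b >= 0."""
    if b == 0:
        return [total / n] * n  # neighbouring ranges do not overlap
    # Root of b*x^2 + a*x + b = 0 with |alpha| < 1 (it is negative).
    alpha = (-a + sqrt(a * a - 4 * b * b)) / (2 * b)
    shape = [(1 - alpha ** i) * (1 - alpha ** (n + 1 - i))
             for i in range(1, n + 1)]
    scale = total / sum(shape)  # plays the role of gamma
    return [scale * s for s in shape]


def left_hand_sides(a, b, m):
    """Values b*m[i-1] + a*m[i] + b*m[i+1], taking m_0 = m_{n+1} = 0."""
    padded = [0.0] + list(m) + [0.0]
    return [b * padded[i - 1] + a * padded[i] + b * padded[i + 1]
            for i in range(1, len(m) + 1)]


if __name__ == "__main__":
    # Hypothetical coefficients, chosen only so that a > 2b.
    weights = minimising_weights(1.0, 0.25, 10.0, 6)
    print([round(w, 4) for w in weights])
    print([round(v, 6) for v in left_hand_sides(1.0, 0.25, weights)])
```

All the left-hand sides should agree (their common value is $\mu$), and the weights should sum to M.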


That these values do in fact give a minimum of S may be proved as follows.

$$S + \frac{M^2}{L} = a \sum_{1}^{n} m_i^2 + 2b \sum_{1}^{n-1} m_i m_{i+1}, \qquad \left( \sum_{1}^{n} m_i = M \right),$$

is a quadratic function of $m_1, m_2, \cdots, m_{n-1}$, and hence can be reduced by
suitable transformations to one of the forms²

$$S + \frac{M^2}{L} = \sum_{1}^{r} \nu_i y_i^2 + d\,y_{r+1},$$

$$S + \frac{M^2}{L} = \sum_{1}^{r} \nu_i y_i^2 + C',$$

where $\nu_1 \geq \cdots \geq \nu_r$, no $\nu_i = 0$, and $d > 0$. Since $S + (M^2/L)$ is always
positive, the first form does not apply, so C' must be a minimum value of $S + (M^2/L)$.
Since the equations for the $m_i$ show only one stationary point exists, then the
point which we have determined must be a minimum.

² Birkhoff, G. and MacLane, S., A Survey of Modern Algebra. New York: Macmillan Company, 1957, p. 280.

3. THE COEFFICIENTS a, b 


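If $\phi_i$ is a symmetrically truncated normal distribution,

$$\phi_i(x) = \begin{cases} \dfrac{\exp\{-\tfrac{1}{2}[(x - x_i)/\sigma]^2\}}{\int_{x_i - (R/2)}^{x_i + (R/2)} \exp\{-\tfrac{1}{2}[(x - x_i)/\sigma]^2\}\,dx} & \text{when } x_i - \dfrac{R}{2} \leq x \leq x_i + \dfrac{R}{2}, \\ 0 & \text{otherwise,} \end{cases}$$

then

$$a = 2 \int_0^{k/\sqrt{2}} e^{-y^2/2}\,dy \Big/ d_1,$$

$$b = 2 e^{-C_1^2/4} \int_0^{(k - C_1)/\sqrt{2}} e^{-y^2/2}\,dy \Big/ d_1,$$

where

$$k = \frac{R}{\sigma}, \qquad C_1 = \frac{L - R}{n - 1} \cdot \frac{1}{\sigma}.$$

The denominator, $d_1$, is not written out, as the $m_i$ are homogeneous
functions of a and b and hence depend only on their ratio.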
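
Since only the ratio b/a matters, it can be evaluated directly. The sketch below is an illustration rather than the author's computation: it uses the identity $\int_0^u e^{-y^2/2}\,dy = \sqrt{\pi/2}\,\operatorname{erf}(u/\sqrt{2})$, so that the common factors cancel, and compares the result with a direct midpoint-rule integration of the unnormalised densities (taking $\sigma = 1$). The function names and the step count are assumptions.

```python
from math import erf, exp


def normal_ratio(k, c1):
    """b/a for truncated normal components (k = R/sigma, c1 = spacing/sigma)."""
    if c1 >= k:
        return 0.0  # neighbouring ranges do not overlap
    return exp(-c1 * c1 / 4) * erf((k - c1) / 2) / erf(k / 2)


def normal_ratio_by_quadrature(k, c1, steps=20000):
    """Same ratio by midpoint integration of the unnormalised densities."""
    def f(u):
        return exp(-u * u / 2) if abs(u) <= k / 2 else 0.0

    def integrate(g, lo, hi):
        h = (hi - lo) / steps
        return h * sum(g(lo + (j + 0.5) * h) for j in range(steps))

    a = integrate(lambda u: f(u) ** 2, -k / 2, k / 2)
    # The product of neighbouring densities is non-zero only on the overlap.
    b = integrate(lambda u: f(u) * f(u - c1), c1 - k / 2, k / 2)
    return b / a


if __name__ == "__main__":
    print(normal_ratio(2.0, 1.2), normal_ratio_by_quadrature(2.0, 1.2))
```

The condition that each point is covered by at most two ranges corresponds to $C_1 \geq k/2$; within that region the ratio stays below one half, in agreement with $a > 2b$.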

If the component at $S_i$, $i = 1, \cdots, n$, can be regarded as a point source
discharging equally in all allowable directions, then $\phi_i$ has the form of a truncated
Cauchy distribution,

$$\phi_i(x) = \begin{cases} \dfrac{y_i}{y_i^2 + (x - x_i)^2} \Big/ 2\tan^{-1}\dfrac{R}{2y_i}, & x_i - \dfrac{R}{2} \leq x \leq x_i + \dfrac{R}{2}, \\ 0 & \text{otherwise.} \end{cases}$$

For instance, if the component is an agricultural spray with a nozzle shaped
like a thin “piece of pie,” with relatively small arc-length (so that the pressure
on the face may be regarded as uniform) and if the perforations are evenly and
thickly distributed over the face, $\phi_i$ will be approximately Cauchy. ($y_i$ must be
measured to the centre of curvature of the face.) If the spray is not held
horizontal, but is elevated through an angle $\alpha$, $y_i$ must be replaced by
$y_i \operatorname{cosec} \alpha$ (see Figure 3).

Fig. 3. MN is vertical, the straight line joining KH horizontal.


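For the Cauchy distribution,

$$a = \left( \tan^{-1} r + \frac{r}{1 + r^2} \right) \Big/ d_2,$$

$$b = \frac{2}{C_2(4 + C_2^2)} \left( \log\frac{1 + r^2}{1 + (C_2 - r)^2} + C_2\left[ \tan^{-1} r - \tan^{-1}(C_2 - r) \right] \right) \Big/ d_2,$$

where

$$r = \frac{R}{2p}, \qquad C_2 = \frac{L - R}{n - 1} \cdot \frac{1}{p}, \qquad \text{and} \quad p = y_i \operatorname{cosec} \alpha.$$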
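
The Cauchy coefficients can be checked in the same spirit. The sketch below is illustrative only: it takes the unnormalised density $1/(1 + t^2)$ on $|t| \leq r$ (with $t$ measured in units of $p$), evaluates the closed-form expressions for the two integrals, and compares them with midpoint-rule integration. The common denominator $d_2$ is omitted since only the ratio is needed; the function names and step count are assumptions.

```python
from math import atan, log


def cauchy_coefficients(r, c2):
    """Unnormalised a and b for truncated Cauchy components."""
    a = atan(r) + r / (1 + r * r)
    if c2 >= 2 * r:
        return a, 0.0  # neighbouring ranges do not overlap
    b = (2 / (c2 * (4 + c2 * c2))) * (
        log((1 + r * r) / (1 + (c2 - r) ** 2))
        + c2 * (atan(r) - atan(c2 - r))
    )
    return a, b


def cauchy_coefficients_by_quadrature(r, c2, steps=20000):
    """The same two integrals by the midpoint rule."""
    def f(t):
        return 1 / (1 + t * t) if abs(t) <= r else 0.0

    def integrate(g, lo, hi):
        h = (hi - lo) / steps
        return h * sum(g(lo + (j + 0.5) * h) for j in range(steps))

    a = integrate(lambda t: f(t) ** 2, -r, r)
    b = integrate(lambda t: f(t) * f(t - c2), c2 - r, r)
    return a, b


if __name__ == "__main__":
    print(cauchy_coefficients(1.0, 1.5))
    print(cauchy_coefficients_by_quadrature(1.0, 1.5))
```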

A Cauchy distribution also occurs if the components consist each of a pair
of the common revolving type of spraying devices with the separate members
mounted one above the other and revolving in opposite directions, provided
that nearly all of the fluid is sprayed out perpendicular to the revolving arm.
(For imagine particles to be uniformly and continuously distributed on the
circumference of the circle $x^2 + y^2 = r^2$. If the particles are projected tangentially
towards the line $y = -p$, where $p > r$, it is easy to show the density of hits along
$y = -p$ obeys a Cauchy law.)


4. A TABLE OF RESULTS FOR THE NORMAL CASE 


There follows a short table of results for the normal case. The values of the
parameters are given by


L = 100 M = 10 
4 
TABLES OF RESULTS FOR THE NORMAL CASE


TABLE 1 
M =10 L=100 








k=} 





m, =0.2998, m2 =0.2936, 
Mg=M,= +++ =m,7=0.2938 (correct to 4 figures) 





m, =0.5992, m2=0.5865, 
Ms=m,= +++ =m,=0.5868 (correct to 4 figures) 





m, =0.6966, m:=0.6605, m;=0.6624, m,=0.6623, 
m= +--+ =m,=0.6623 (correct to 4 figures) 





m, =0.7968, m:=0.7631, m;=0.7645, 
M,= +++ =m,=0.7645 (correct to 4 figures) 











my, Me m3 Ms = M6 


.8172 .8233 : 0.8229 
.9710 -9827 . 0.9816 
.0841 -0927 . 1,0920 
. 2344 . 2369 
. 1964 .2183 
-4026 -4073 
.3603 .3864 
.6529 -§539 
.6082 .6227 
.5585 





.8909 
.0833 
. 1852 
-2919 
. 3696 
.4865 
.5620 
-6932 
. 7691 
.8408 


TABLE 2





m, =0.2995, m2 =0.2937, 
My =m, = ++ + =m; =0.2938 (correct to 4 figures) 





m= 0.5987, M2= 0.5866, 

Ms=m,= +++ =m,=0.5869 (correct to 4 figures) 
m, =0.6956, m2:=0.6608, m;=0.6625, m,=0.6624, 
Ms= +++ = msg =0.6625 (correct to 4 figures) 








m, =0.7958, m:=0.7634, m;=0.7647, 
m,= +++ =m,=0.7646 (correct to 4 figures) 











.8235 0.8231 0.8231 
. 9829 0.9818 0.9819 
.0931 1.0924 | 1.0925 
. 2373 1.2372 
.2186 1.2159 
.4079 1.4077 
.3867 
.6544 
.6235 


ms: m3 ms, Ms = M5 
.8178 
.9717 
.0849 
. 2349 
.1972 
.4035 
.3612 
.6535 
.6097 

















TABLE 3








k=2 





m, =0.2952, me= +++ =m,,=0.2940 (correct to 4 figures) 


m; =0.5906, m,= +++ =ms=0.5879 (correct to 4 figures) 


m, =0.6745, m.=0.6654, ms=--++ =ms=0.6655 
(correct to 4 figures) 








m, =0.7759, m:,=0.7679, m,= +--+ =m;,=0.7680 
(correct to 4 figures) 














= 


m3 Mig = 6 


.8298 . 0.8298 
. 9930 § 0.9939 
. 1046 q 1.1046 
- 2466 } 
2355 
.4225 
.4090 
.6638 
-6507 
.6324 




0.8514 
1.0293 
1.1347 
1.2604 
1.2970 
1.4441 
1.4817 
1.6724 
1.7002 
1.7433 

























Me Mes 
1.9864 1.9869 
.9627 1.9669 





1 
1.9282 1.9434 
1.8818 1.9214 





3 


Nile 

-4830 
4552 
.4166 
.3679 
.3105 
. 2460 
.1762 
.1014 
3184 
2839 
. 2327 
.1647 
.0752 
. 9639 
8294 
6669 
.4756 
. 2469 
.9801 
.6671 
.3952 
.8492 
.3214 
An inspection of the tables gives a good picture of the behaviour of the $m_i$.
For instance, the tables indicate that the maximum value of $m_i$ occurs when
i=1 or n. That this is true in general follows at once from the fact that $m_1$ is
greater than or equal to $m_i$ according as $\alpha$ is less than or equal to $\alpha^{n+1-i}$. An
inspection will also indicate that there is a certain degree of alternation between
the $m_i$. In fact, provided $3 \leq i \leq (n+1)/2$, we have either

$$m_{i-2} > m_i > m_{i-1},$$

or

$$m_{i-2} < m_i < m_{i-1}.$$

The first inequalities hold when i is odd, and the second when i is even. These
results follow from the inequalities

$$|\alpha^{n+1-i}| < |\alpha^{i-1}|, \qquad |\alpha^{n+1-i}| < |\alpha^{i-2}|,$$

which are true if i is less than or equal to (n+1)/2.

The explanation of this behaviour is that $m_1$ and $m_n$ must be greater than
the others to offset the lack of components to increase the fluid on their
outside wings. Then, because of the increases in $m_1$ and $m_n$, $m_2$ and $m_{n-1}$ must be
decreased, etc.

It will also be noted that, for constant n, the difference between the
minimum and the maximum $m_i$ increases with R, and that as n increases, there is a
tendency for the $m_i$ to approach equality.


The author wishes to thank Mr. S. J. Prokhovnik for his help in computing 
the tables on UTECOM, and Mr. J. Butcher, of the University of Sydney, for 
his interest and very helpful suggestions. 





EXACT AND APPROXIMATE DISTRIBUTIONS FOR 
THE WILCOXON STATISTIC WITH TIES 


SHIRLEY YOUNG LEHMAN
National Bureau of Standards* 


The exact distribution of the Wilcoxon statistic was calculated for 
five specific cases where both samples contained five observations, some 
of which were tied. The five specific cases were obtained from an actual 
experiment. The exact distributions for each of these five cases are 
compared with some approximations at points near 1, 5, and 10 per
cent. Critical values and associated probabilities have been calculated 
for the exact and approximate distributions. 


THE exact distribution of the Wilcoxon rank-sum statistic for two-sample
fiat is given in Table 2 for five specific cases where both of the (inde- 
pendent) samples contained five observations, some of which were tied. The 
ties were treated by the method of mid-ranks, as suggested by Kruskal and 
Wallis [2] (see also Putter [3]), and the distributions were constructed by ob- 
serving the frequency of rank sums obtained from all possible combinations of 
ten ranks taken five at a time. The five specific cases, whose observed ranks 
are shown in Table 1, came from an experiment in which approximations of 
untested accuracy could not be relied upon. Consequently, the corresponding 
exact distributions were calculated and recorded in an unpublished National 
Bureau of Standards report of August 1952. Recently it was suggested that 
these distributions be compared with some approximations at points near 1, 5, 
and 10 per cent. 
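
The construction described above (mid-ranks for ties, then the frequency of rank sums over all combinations of ten ranks taken five at a time) can be sketched in Python as follows. The pooled observations in the example are hypothetical and are not the data of Table 1; the function names are likewise illustrative.

```python
from collections import Counter
from itertools import combinations


def midranks(values):
    """Assign ranks 1..N, giving tied observations the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1..end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def exact_rank_sum_distribution(ranks, n):
    """Frequency of every rank sum over all choices of n of the ranks."""
    return Counter(sum(c) for c in combinations(ranks, n))


def upper_tail_probability(dist, r):
    """Exact probability that the rank sum is at least r."""
    total = sum(dist.values())
    return sum(f for s, f in dist.items() if s >= r) / total


if __name__ == "__main__":
    # Hypothetical pooled sample with tie groups of sizes 3, 3 and 2.
    pooled = [1, 1, 1, 2, 3, 3, 3, 4, 5, 5]
    dist = exact_rank_sum_distribution(midranks(pooled), 5)
    for s in sorted(dist, reverse=True)[:5]:
        print(s, dist[s], round(upper_tail_probability(dist, s), 4))
```

With ten observations there are only 252 combinations, so full enumeration is immediate.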

It is realized that the results presented here apply, in the strictest sense, only 
to the five specific cases investigated. However, since the results of the normal 
approximations are in such good agreement among the five cases investigated, 
it is felt that these approximations should apply to other cases where there are 
sample-sizes m=n=5 with ties among the observations. The tables are useful
in that they serve as an illustration of how the methodology can be used when 
it is important to have exact probabilities. The results of these comparisons 
with approximations may be useful to statisticians contemplating the calcula- 
tion of exact distributions.** 

A critical value, R, and its associated probability were calculated at the 
probability levels closest to each of the 1, 5, and 10 per cent points (one-sided 
and two-sided) for sample sizes m=n=5 with no ties among the observed 
ranks and for the five specific cases having sample sizes m=n=5 with ties 
among the observed ranks for the following:

i) the exact distributions

ii) the normal approximation to the distributions

iii) the normal approximation to the distributions with a continuity correction.





* Now at the Research Triangle Institute, Durham, North Carolina. 
** A referee has pointed out that, with a little ingenuity, schemes for rapid computation of exact distributions 
can be worked out to suit a particular case. If the computation is deemed necessary, it is not formidable.
TABLE 1. OBSERVED RANKS FOR SAMPLE-SIZES m=n=5
IN FIVE SPECIFIC CASES INVOLVING TIES











Case | Observed Ranks 





91 
23 


24 


The test statistic, R’, to be compared with the critical value, R, is the larger
of the two rank sums obtained from the ten observations, with the obvious
modification for the one-sided case. Critical values and their associated
probabilities were calculated from a table of the exact distribution for m=n=5 without
ties (e.g., Table A-20 in Dixon and Massey [1]). Following Kruskal and Wallis
[2, pp. 590-3], the formula

$$\frac{2R' - n(N + 1)}{\sqrt{\dfrac{n[N(N + 1)(N - 1) - \sum T]}{3N} \cdot \dfrac{N - n}{N - 1}}} \tag{1}$$

was used as the approximate unit normal deviate. Throughout this study n=5
and N=10. If t is the number of tied observations in a group then define
$T = (t - 1)t(t + 1) = t^3 - t$, and $\sum T$ is the summation over all groups of ties. In
the case of no ties $\sum T = 0$. The continuity adjustment was made by replacing
R’ by (R’ − ½) in formula (1).

In general, determination of the critical value, R, corresponding to a desired
probability level involved equating formula (1) to the appropriate normal
deviate and solving for 2R, to obtain 2R=x say, which did not turn out to be an
integer. The two nearest integers, i.e., [x] and [x]+1, where [x] denotes the
greatest integer not greater than x, were substituted into the equation to
obtain two normal deviates, and the probability level associated with each was
then obtained from a table of the normal probability integral. The one of these
integers whose associated probability level was closest to the desired
probability level was selected as the “best” value of 2R, and was then halved to obtain
the critical value R. (It is worth mentioning that if the number of observations
tied in any one set of tied observations is even, then the rank sum can take
half-integer values. Since approximations corresponding to the no-ties case are
being considered as possible approximations for use in the five specific cases
involving ties, the critical values, R, have been determined to the nearest
half-integer.)

As an illustration, consider the normal approximation at α=.01 (one-sided)
for Case III. There are three groups of ties with $t_1 = 3$, $t_2 = 3$, $t_3 = 2$; hence,
$\sum T = 54$. Formula (1) then is

$$\frac{2R - 5(11)}{\sqrt{\dfrac{5[990 - 54]}{30} \cdot \dfrac{5}{9}}} = 2.327,$$

which gives

$$2R = 76.663.$$

To determine 2R to the nearest integer, 76 and 77 were substituted separately
in formula (1) to obtain two normal deviates. The probability level associated
with each was obtained from a table of the normal integral:

$$\frac{76 - 55}{9.3095} = 2.256 \text{ with Pr}(.012),$$

$$\frac{77 - 55}{9.3095} = 2.363 \text{ with Pr}(.009).$$

Examination of the two results indicates that 2R=77 with associated
probability (.009) is closest to the desired α=.01 (one-sided) level, thus R = 38.5 is
the “best” critical value. From the exact distribution for Case III shown in
Table 2 the actual probability level for 38.5 is .012. Thus, if the normal
approximation were used in this case we would say the critical value is R=38.5 with
associated probability (.009), when in fact the actual probability from the
exact distribution is (.012). It should be noted that due to the thinly populated
tails of the exact distribution there is considerable latitude in the rank sums
having probability .012, i.e., in the exact cumulative distribution for Case III,
rank sums from 37 to 39 each have probability .012.
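
The arithmetic of this illustration can be reproduced with a short Python sketch. It follows formula (1) as stated above; the function names are hypothetical, and the normal tail probability is computed from the complementary error function rather than read from a table.

```python
from math import erfc, sqrt


def tie_correction(tie_sizes):
    """Sum of T = t^3 - t over all groups of ties."""
    return sum(t ** 3 - t for t in tie_sizes)


def deviate_denominator(n, N, tie_sizes):
    """Denominator of formula (1)."""
    T = tie_correction(tie_sizes)
    return sqrt(n * (N * (N + 1) * (N - 1) - T) / (3 * N) * (N - n) / (N - 1))


def normal_deviate(two_R, n, N, tie_sizes, continuity=False):
    """Formula (1) as a function of 2R'.

    With the continuity adjustment R' is replaced by R' - 1/2,
    i.e. 2R' is replaced by 2R' - 1.
    """
    numerator = two_R - n * (N + 1) - (1 if continuity else 0)
    return numerator / deviate_denominator(n, N, tie_sizes)


def upper_tail(z):
    """One-sided probability that a unit normal deviate exceeds z."""
    return 0.5 * erfc(z / sqrt(2))


if __name__ == "__main__":
    ties = [3, 3, 2]  # Case III: three groups of ties
    for two_R in (76, 77):
        z = normal_deviate(two_R, 5, 10, ties)
        print(two_R, round(z, 3), round(upper_tail(z), 3))
```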

The results are presented in Table 3. For the exact distributions the asso- 
ciated probabilities in Table 3 are indeed true probabilities, but for the normal 
approximations the associated probabilities in the table are the nominal prob- 
ability values which correspond to the unit normal deviates obtained in the 
calculation described above. 

In comparing the approximations with the exact distributions in Table 3, 
Case II appears to be worst, but recall from the observed ranks in Table 1 that 
Case II is the most heavily tied case among the five specific cases. 

One consideration in the use of approximations is whether the actual sig- 
nificance levels obtained with approximations are the same as the significance 
levels for the exact distributions. Comparison of the approximations in Table 
3 with the exact distributions in Table 2 indicates that there is good agreement 
most of the time. The best agreement is near the a=.01 (one-sided) level. The 
a= .01 (two-sided) level errs in that the actual probability obtained is too low. 
This is due to the small sample sizes involved and the use of an approximation 
to a discrete distribution where the tails of the discrete distribution are thinly 
populated. 

When an approximation in Table 3 leads to an actual significance level
different from that of the exact test, it is about equally often too high or too low
for values near the α=.05 and α=.10 levels. In general the continuity
correction seems advisable because approximations based on it show better
agreement with the actual significance levels. It is more important to use the
continuity correction if the critical value, R, is rounded to the nearest integer; and
of less importance to use the continuity correction when rounding the
critical value, R, to the nearest half-integer. Results with the continuity
correction tend to be “conservative” in that when there is a difference the actual
significance level is lower than α. The normal approximation without the
continuity correction tends to err in the opposite direction.

TABLE 2. EXACT DISTRIBUTIONS OF RANK SUMS FOR SAMPLE SIZES
m=n=5 IN FIVE SPECIFIC CASES INVOLVING TIES

Rank Sum
Smaller   Larger
15        40
15.5      39.5
16        39
16.5      38.5
17        38
17.5      37.5
18        37
18.5      36.5
19        36
19.5      35.5
20        35
20.5      34.5
21        34
21.5      33.5
22        33
22.5      32.5
23        32
23.5      31.5
24        31
24.5      30.5
25        30
25.5      29.5
26        29
26.5      28.5
27        28
27.5      27.5

mean

variance

std. dev.

F = frequency.
CP = cumulative probability.

These observations are consistent with the statement made in the footnote 
on p. 591 of Kruskal & Wallis [2], that comparison of the exact probabilities 
for the two-sample test with those of the approximations indicates that when 
the probability is greater than .02 the normal approximation with the con- 
tinuity correction is usually better. 
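
The comparison discussed above can be tried numerically. The following is a minimal sketch, not the calculation used for Table 3: it assumes the usual tie-corrected variance, in which each group of t tied observations removes t³ − t, and a continuity correction of one-half applied toward the mean. The function name and its parameters are illustrative only.

```python
import math

def rank_sum_upper_tail(R, m, n, tie_sizes=(), continuity=True):
    """Normal approximation to P(rank sum of the size-n sample >= R)."""
    N = m + n
    mean = n * (N + 1) / 2.0
    # Assumed tie correction: every group of t tied values removes (t^3 - t).
    tie_term = sum(t ** 3 - t for t in tie_sizes)
    var = m * n * (N + 1) / 12.0 * (1.0 - tie_term / (N ** 3 - N))
    # Assumed continuity correction: move the deviate one-half toward the mean.
    shift = 0.5 if continuity else 0.0
    z = (R - mean - shift) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# m = n = 5, critical value R = 38.5, with and without two tied pairs.
print(rank_sum_upper_tail(38.5, 5, 5))
print(rank_sum_upper_tail(38.5, 5, 5, tie_sizes=(2, 2)))
print(rank_sum_upper_tail(38.5, 5, 5, continuity=False))
```

With these conventions the corrected tail is larger than the uncorrected one, which matches the remark that the continuity correction tends to be conservative.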
A working guide concerning the use of the correction for ties in the normal
approximation for the two-sided test (p. 587 of Kruskal & Wallis [2]) states
that with ten or fewer observations a probability of .01 or more will not change
by more than ten per cent, when the correction for ties is used in the normal
approximation formula, if not more than one-fourth of the observations are
involved in ties. Below are listed the actual percentage changes which occurred
when ties were considered.

Normal approximation probabilities (P) and the per cent change (%) in the
five specific cases involving ties

             α=.01 (two-sided)   α=.05 (two-sided)   α=.10 (two-sided)
Case           P        %          P        %          P        %
No ties      (.009)              (.047)              (.094)
I            (.010)    11        (.044)     5        (.090)
II           (.009)     0        (.051)              (.104)
III          (.010)    11        (.053)              (.108)
IV           (.011)    22        (.045)              (.092)
V            (.009)     0        (.046)              (.094)

[The column giving the observations involved in ties, and the remaining
per cent entries, are not recoverable.]

The working guide is confirmed because Case V
is the only case in which less than one-fourth of its observations are involved
in ties, and the probability changed by less than ten per cent. It is interesting
to note that among the remaining four cases there are several cases where the
change was less than ten per cent even though more than one-fourth of the
observations were involved in ties.

The reporting of these results was suggested at various times by C. Eisen- 
hart, I. R. Savage, and W. H. Kruskal. I should like to gratefully acknowledge 
the help and guidance Joan R. Rosenblatt has given me throughout the prep- 
aration of this material. 
ON THE USE OF PARTIALLY ORDERED OBSERVATIONS IN
MEASURING THE SUPPORT FOR A COMPLETE ORDER


R. F. Tate¹
University of Washington


A random experiment which results in some ordering of the M events
E_1, E_2, ..., E_M may be such that these events are observable not
individually, but in the form of ordered subsets. A discussion is given
for the partial order problem of determining information about one or
more complete orders of the M events from partially ordered observations.
Relationships are mentioned between the (rank-order) methods
of the present paper and those of Rank Correlation and the k-sample
problem.





1. INTRODUCTION 


Consider a random experiment ℰ, every trial of which results in one or another
random order of the M events E_1, E_2, ..., E_M. Assume that the M
events always occur one at a time, but that for a given trial information
relevant to order is obtained only in the form of an ordered collection of subsets
of these events; within any subset nothing is known about the order of
occurrence of events from the original sequence. The number of subsets, k, will be
restricted by the condition 2 ≤ k ≤ M; both k and the subset sizes may depend
on chance, or be at the discretion of the observer.

In the case of an experiment ℰ of the type mentioned above the trials may be
governed by some law of stochastic order. That is to say, the results of these
trials may tend to cluster about some particular order, for example
E_1 < E_2 < ... < E_M. In such a case the order E_1 < E_2 < ... < E_M will play
the role of a parameter-value, and will be referred to as the “true” order. The
degree of clustering will vary widely, of course, depending on the experimental
situation. In regard to the experimental situation it should be remembered
throughout the paper that the determination of the partitioning (that is, the
value of k and the subset sizes) for the different trials may be forced on the
investigator by the circumstances underlying the experiment, or may be a free
choice made by him.


Example 1: A class of M students is given one or more timed manual dexterity
tests, the time allowed generally being sufficient for only a portion of the stu- 
dents to finish. The teacher leaves the room, but returns periodically to note 
the latest subset of students who have finished. Depending on the number of 
times the teacher returns to the class the number of subsets may vary greatly. 

In this example the students may have nearly the same degree of manual 
dexterity, in which case their true ranking may be frequently violated by their 
performance in repeated trials, and the amount of clustering about any one 
order can be expected to be low. 


Example 2: A sample of dead human fetuses is to be examined to determine
information on the order in which bone centers form from cartilage for the 19
bones of the foot. Those fetuses for which 1 to 18 bone centers have formed
give data pertinent to the investigation, since solutions are available which
cause bone centers to take on a certain color. X-ray of a living fetus would be
an inaccurate and near-fatal method of investigating this question. Thus, there
are of necessity two subsets on each trial, the set of bone centers formed, and
those not formed, when the fetus is examined. Such a problem was considered
by Kraus [3], and will be reconsidered in Section 3 as Example 7.

¹ Research sponsored by the National Science Foundation (grant G14284), Office of Naval Research (Navy
Theoretical Statistics Project), and United States Public Health Service (National Institute of Dental Research,

This example is of a different type, since it is felt by many osteologists that 
the order of formation of bone center from cartilage is genetically determined. 
The degree of clustering about the “true” order should then be high. 

Some problems can now be stated which are of interest in connection with 
“true” orders, and which will be solved in some detail in Section 3. 


Problem 1. Determine from N trials of an experiment ℰ whether a specified
order should be considered a candidate for “true” order, or whether it should be 
removed from consideration. 


Problem 2. For two possible “true” orders determine if one specified order is 
more likely to be the “true” order than is the other. 


Problem 3. Estimate in some way the “true” order. 


In order to solve these problems we need the concept of “support” for an
order. A model is developed in Section 2, and from it is derived a statistic
Φ, based on N trials of ℰ, which measures the degree of confirmation offered by
the experimental observations, for any possible “true” order for the M events.
Φ will be termed the support for this order.

Derivations and proofs of optimal properties for Φ are contained in Appendix
A. Other technical discussions are also deferred to the appendix, so that the
main ideas and applications will not be obscured. The approach will be a
unified one throughout, based on the notion of support.

For the case of one trial (N = 1) our solution to Problem 1 is essentially a
reinterpretation of Kendall’s application of τ (the coefficient of rank correlation)
to a situation involving tied ranks (see Kendall [2], Chapter 3.6). The solution
also corresponds with the independent work of Terpstra ([9], [10]) and
Jonckheere [1] on the problem of k independent random samples. For N > 1
the solutions to our problem and the k-sample problem differ; the distinction
between the two is presented at the end of the discussion of Problem 1 in
Section 3. Asymptotic normality is employed for Φ in an application in Section
3, and is proved in Appendix B; the result is similar to limiting results obtained
in references [9] and [1], but the assumptions and the proof are quite different.
Problem 3 is also considered by Kendall ([2], Chapter 6.10), but his solution
differs from that presented here; the two methods are compared in an example.
Problem 1 for N > 1, and Problem 2, do not appear to have been previously
treated in the literature. Finally, consistency and unbiasedness of tests based
on Φ are proved in Appendix C.


2. DERIVATION AND PROPERTIES OF Φ
The application of the ensuing methods calls for the consideration of two
kinds of ranking procedures. We can illustrate these by considering a single
trial of the experiment.

Consider any experiment ℰ and a specified order E_1 < E_2 < ... < E_M. A single
trial provides us with an observed partial order for these M events, in the form
of subsets. For each subset the events contained therein can be labelled according
to their places in rank in the specified order. This provides us with an ordering
for subsets of ranks:

\[ S_1 < S_2 < \cdots < S_k, \qquad k = 2, 3, \ldots, M. \tag{2.1} \]

For each subset S_λ denote the number of elements by n_λ. If we consider two
different specified orders, then we must consider along with them two different
orderings of the form (2.1).


Example 3. Let an experiment result in M = 6 events. Denote these by A, B, C,
D, E, F. Now, let the two specified orders referred to above be (1) A<B<C<D<E<F,
(2) B<C<A<F<E<D, and let the experimental outcome be
{B, F} < {C} < {A, D, E}. Now, we have M = 6; k = 3; n_1 = 2, n_2 = 1, n_3 = 3.

    Specified Order            S_1 < S_2 < S_3
    (1) A<B<C<D<E<F        {2,6} < {3} < {1,4,5}
    (2) B<C<A<F<E<D        {1,4} < {2} < {3,5,6}


Let us adopt as short forms for these two orders, respectively,

    (26|3|145),   (14|2|356).   (2.2)


This notation will be maintained throughout the paper. As one might suspect 
it will develop from later considerations that the observation considered here 
offers more support for order (2) than for order (1). Note that within any sub- 
set, {3, 5, 6} for example, the numbers are written in increasing order. This 
is purely for convenience in reading; no order for the events corresponding to 
these numbers is known. 
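
The first ranking procedure can be sketched in a few lines. This is only an illustration under assumed conventions: the helper names `short_form` and `show` are hypothetical, events are single letters, and a specified order is written as a string from first to last.

```python
def short_form(specified_order, outcome):
    """Relabel an observed ordered partition by ranks in a specified order."""
    rank = {event: i + 1 for i, event in enumerate(specified_order)}
    # Within a subset the ranks are sorted only for readability.
    return [sorted(rank[e] for e in subset) for subset in outcome]

def show(form):
    """Write a short form in the vertical-bar notation of (2.2)."""
    return "(" + "|".join("".join(str(r) for r in s) for s in form) + ")"

outcome = [{"B", "F"}, {"C"}, {"A", "D", "E"}]
print(show(short_form("ABCDEF", outcome)))  # (26|3|145)
print(show(short_form("BCAFED", outcome)))  # (14|2|356)
```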

The main purpose of the first ranking procedure, illustrated in Example 3, 
is to allow us to write down an observation in as many concise forms as we have 
particular orders under consideration. For our development of the notion of 
support it will, of course, not be necessary to consider more than one specified 
order, and hence not more than one short form for the observation at any one 
time. 

We have seen that through the first ranking procedure an observation can
be thought of as an ordering S_1 < S_2 < ... < S_k, in which each S_λ is a subset of
the ranks 1, 2, ..., M, and that these subsets depend on the particular order
of events under consideration. The increment of support offered by S_λ should
depend not only on the elements of S_λ, but among other things on the relation
of these elements to those in succeeding subsets. In the observation (26|3|145)
for the specified order (1) of Example 3, the rank 3 is important for its magnitude,
but also in that it is the second lowest number which occurs in either the
second or third subset. This suggests that for a discussion of support we might
consider pooled sets from S_λ on, for each λ, and a new ranking procedure which
assigns to the numbers in S_λ their ranks in the corresponding pooled set.
Accordingly, the following notation is introduced:

T_λ = pooled set formed by S_λ, S_{λ+1}, ..., S_k;
m_λ = number of elements in T_λ;
r_λi = the rank, among all the numbers in T_λ, of the ith number in S_λ (the
numbers of S_λ being taken in increasing order), i = 1, 2, ..., n_λ.   (2.3)

Thus, for the extreme pooled sets, T_1 and T_k, we have

T_1 = {1, 2, ..., M}, m_1 = M, {r_11, r_12, ..., r_1n_1} = S_1;
T_k = S_k, m_k = n_k, {r_k1, r_k2, ..., r_kn_k} = {1, 2, ..., n_k}.


Example 4. For the observation (26|3|145) of Example 3 we have

T_1 = {1, 2, 3, 4, 5, 6}; m_1 = 6; r_11 = 2, r_12 = 6.
T_2 = {1, 3, 4, 5}; m_2 = 4; r_21 = 2.
T_3 = {1, 4, 5}; m_3 = 3; r_31 = 1, r_32 = 2, r_33 = 3.
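
The second ranking procedure, which produces the quantities of Example 4, might be sketched as follows. The function name `pooled_ranks` is hypothetical, and a short form is assumed to be given as a list of lists of ranks.

```python
def pooled_ranks(form):
    """For each subset S_lambda return (m_lambda, [r_lambda_1, r_lambda_2, ...])."""
    result = []
    for lam, subset in enumerate(form):
        # T_lambda pools S_lambda with every succeeding subset.
        pooled = sorted(x for later in form[lam:] for x in later)
        position = {value: i + 1 for i, value in enumerate(pooled)}
        result.append((len(pooled), [position[x] for x in sorted(subset)]))
    return result

print(pooled_ranks([[2, 6], [3], [1, 4, 5]]))
# [(6, [2, 6]), (4, [2]), (3, [1, 2, 3])]
```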


Our assumptions for Φ are phrased in terms of the notation above. It should
be noticed at the outset, however, that S_k, having no successor, will not
contribute to the support if the present point of view is adopted.² Therefore S_k,
and hence T_k, will not enter into our assumptions.


MODEL FOR Φ


1. If a particular order is specified, then the support φ offered for that order
by a single observation S_1 < S_2 < ... < S_k is the sum of numerical contributions
made by the first k−1 of these subsets. For each subset S_λ the contribution
depends on the ranks of its elements when considered as elements of the
larger set T_λ. Also, these contributions must satisfy the following two conditions:

(i) For those two possibilities consisting of S_λ being made up of either the
n_λ elements of smallest rank in T_λ or the n_λ elements of largest rank in T_λ,
the contributions have the same magnitude; the former contribution is positive,
and the latter is negative.

(ii) If for a given observation a pair of numbers, one from S_λ and one from a
higher subset, are exchanged, there results an addition to the contribution of
S_λ by an amount proportional to (x−y), where x and y are the ranks assigned
to these numbers (that is, r_λi’s), x to the element of S_λ, and y to the element of
the higher subset.


2. The number of subsets may vary from one observation to another, and the
total support for all N observations is

\[ \Phi = \sum \phi. \]


Let us look at some illustrations of 1(i) and 1(ii). Consider the first subset
S_1 = {1, 2} from the observation (12|456|3). By 1(i) the contribution of this
subset is positive. If the observation were (56|123|4), the contribution of the
first subset would be negative, but equal in magnitude to that of the subset
previously mentioned. In order to illustrate 1(ii) let us first assume that the
constant of proportionality is c > 0. Now by 1(ii) the addition to support of the
2nd subset of (26|145|3) by interchanging 1 and 3 would be c(1−2) = −c,
actually a decrease in this case. The x of 1(ii) is 1 in this example, and the y is
2, because 3 has the rank 2 in T_2.

² The whole procedure to follow could also be formulated in terms of predecessors of S_λ instead of successors,
with T_λ the pooled set formed by S_1, S_2, ..., S_λ, and so on.

It is shown in Appendix A that the conditions given in the Model for Φ
will determine it uniquely, except for an arbitrary constant multiplier. The
exact expression is

\[ \Phi = \sum \phi, \qquad \phi = \frac{1}{2} c \sum_{\lambda=1}^{k-1} \Big\{ n_\lambda (m_\lambda + 1) - 2 \sum_{i=1}^{n_\lambda} r_{\lambda i} \Big\}; \tag{2.4} \]

the c which appears here is the constant of proportionality mentioned in 1(ii)
of the Model. If we choose c = 2, then Φ can be shown to be the statistic of
Kendall [2], Terpstra [9], and Jonckheere [1]. For computations there is a
much better form, namely,

\[ \Phi = \sum \phi, \qquad \phi = 2P + \tfrac{1}{2}(D^2 - M^2), \tag{2.5} \]

where for a single observation

P = number of times an element of a subset is numerically less than an element
of a succeeding subset,
D² = sum of squares of all k subset sizes.


The derived form for Φ given in Appendix A is (2.4). Following this, form (2.5)
is shown to be equivalent to (2.4). Later there, under “additional properties
of Φ”, a third form is given.


Example 5. Again consider the case M = 6, and specified order (1) of Example 3.
The table below contains a complete calculation of Φ based on 7 observations.
Note that the first observation is the one used in Example 3; the second, third,
and fourth were mentioned above to illustrate 1(i) and 1(ii) of the Model.

Trial  Experimental Outcome           Short form      Subset sizes     P   D²    φ
  1    {BF} < {C} < {ADE}             (26|3|145)      2, 1, 3          5   14   −1
  2    {AB} < {DEF} < {C}             (12|456|3)      2, 3, 1          8   14   +5
  3    {EF} < {ABC} < {D}             (56|123|4)      2, 3, 1          3   14   −5
  4    {BF} < {ADE} < {C}             (26|145|3)      2, 3, 1          4   14   −3
  5    {ACD} < {BEF}                  (134|256)       3, 3             7   18   +5
  6    {ABDEF} < {C}                  (12456|3)       5, 1             2   26   −1
  7    {B} < {E} < {C} < {F} < {AD}   (2|5|3|6|14)    1, 1, 1, 1, 2    7    8    0

                                                                         Φ = 0


For a given observation the calculation of P may be performed by considering
all elements except those in the last subset, proceeding from left to right. The
complete calculation for the first line, based on (2.5), is:

\[ P = 3 + 0 + 2 = 5, \qquad D^2 = 4 + 1 + 9 = 14, \qquad M^2 = 36; \]
\[ \phi = 2(5) + \tfrac{1}{2}(14 - 36) = -1. \]

For the calculation of the first line, based on (2.4) with c = 2, we have:

\[ m_1 = 6, \; m_2 = 4; \qquad r_{11} = 2, \; r_{12} = 6; \qquad r_{21} = 2. \]
\[ \phi = 2(6 + 1) - 2(2 + 6) + 1(4 + 1) - 2(2) = -1. \]


It is instructive to look at trials 5, 6, and 7. Trial 5 gives two subsets, with the 
order of elements unknown for each, but with 5 and 6 in the second subset; 
the support is +5. Trial 6 may look better at first glance, but actually offers 
negative support. Trial 7 gives more information than any of the other trials, in 
that the ordering is nearly complete, but asserts nothing for or against the order 
in question. 
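
As a check on the two forms of the support, the sketch below evaluates (2.4) and (2.5) for the seven short forms of Example 5. The function names are hypothetical, and short forms are assumed to be lists of lists of ranks.

```python
def support_by_ranks(form, c=2):
    """phi from (2.4): contributions of the first k-1 subsets."""
    total = 0
    for lam in range(len(form) - 1):
        pooled = sorted(x for later in form[lam:] for x in later)  # T_lambda
        ranks = [pooled.index(x) + 1 for x in form[lam]]           # r_lambda_i
        total += len(ranks) * (len(pooled) + 1) - 2 * sum(ranks)
    return c * total / 2

def support_by_pairs(form):
    """phi from (2.5), which corresponds to c = 2."""
    M = sum(len(s) for s in form)
    D2 = sum(len(s) ** 2 for s in form)
    P = sum(1 for a in range(len(form)) for b in range(a + 1, len(form))
            for x in form[a] for y in form[b] if x < y)
    return 2 * P + (D2 - M * M) // 2  # D2 - M^2 is always even

TRIALS = [
    [[2, 6], [3], [1, 4, 5]],
    [[1, 2], [4, 5, 6], [3]],
    [[5, 6], [1, 2, 3], [4]],
    [[2, 6], [1, 4, 5], [3]],
    [[1, 3, 4], [2, 5, 6]],
    [[1, 2, 4, 5, 6], [3]],
    [[2], [5], [3], [6], [1, 4]],
]
values = [support_by_pairs(t) for t in TRIALS]
print(values, sum(values))  # [-1, 5, -5, -3, 5, -1, 0] 0
```

Form (2.5) needs only pairwise comparisons and the subset sizes, which is why it is the more convenient of the two for hand or machine computation.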

Several additional properties for Φ may be presented at this point. Some of
these results are well-known in one form or another. For further discussion the
reader is referred to Appendix A.

(1) If the n_λ ranks in T_λ associated with the subset S_λ are symmetrically
located about the number ½(m_λ+1), then the contribution of S_λ to φ is zero.

(2) For an observation with k subsets the k−1 contributions to φ are
statistically independent random variables.

(3) For an observation with k subsets, and fixed subset sizes, the maximum
possible support is

\[ \sum_{\lambda=1}^{k-1} n_\lambda (m_\lambda - n_\lambda) = \tfrac{1}{2}(M^2 - D^2). \]

The minimum possible support is the negative of this quantity.

(4) For an observation with k subsets the maximum possible support
mentioned in (3) is greatest if each subset has M/k elements.
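
Properties (3) and (4) can be confirmed by brute force for a small case. The sketch below assumes c = 2 and form (2.5), uses M = 6 with subset sizes 2, 1, 3, and enumerates all 720 assignments of ranks to subsets; the helper names are hypothetical.

```python
from itertools import permutations

def support(form):
    """phi from (2.5)."""
    M = sum(len(s) for s in form)
    D2 = sum(len(s) ** 2 for s in form)
    P = sum(1 for a in range(len(form)) for b in range(a + 1, len(form))
            for x in form[a] for y in form[b] if x < y)
    return 2 * P + (D2 - M * M) // 2

def max_support(sizes):
    """Right-hand side of property (3): (M^2 - D^2) / 2."""
    M = sum(sizes)
    return (M * M - sum(n * n for n in sizes)) // 2

def split(perm, sizes):
    """Cut a permutation of the ranks into consecutive subsets of given sizes."""
    parts, start = [], 0
    for n in sizes:
        parts.append(list(perm[start:start + n]))
        start += n
    return parts

sizes = (2, 1, 3)
values = [support(split(p, sizes)) for p in permutations(range(1, 7))]
print(max(values), min(values), max_support(sizes))  # 11 -11 11
print(max_support((2, 2, 2)))                        # 12
```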

Finally, the last paragraph of Appendix A indicates another derivation of
Φ, based on a different ranking procedure.


3. SOLUTIONS TO PROBLEMS 
PROBLEM 1 


This problem, stated in the introduction, is that of determining whether a 
specified order is a candidate for “true” order, or whether it should be removed 
from consideration. 

We can test the hypothesis of “randomness,” that is to say that all M!
permutations of the M events are equally likely. Moreover, and this is the
important point, we can do it in such a way that rejection of the random order
automatically gives evidence in favor of the specified order. This is clear from
the way Φ is defined, in terms of the specified order. If Φ turns out to be
significantly greater than zero, then the specified order is definitely a candidate; of
course, many of the M! possible orders may be better candidates. If the
contrary occurs, then the specified order should be rejected.

For the case of a single trial (N = 1), critical values for Φ under the assumption
of “randomness” have been tabulated by Terpstra [10] and Jonckheere
[1]. The former actually tabulated for our P (T in his notation), but a simple
transformation gives Φ. He used k = 3 and 1 ≤ n_1 ≤ n_2 ≤ n_3 ≤ 5. The latter
tabulated critical values for Φ directly, using k = 3, 4, 5, 6 and an equal subset size n
such that nk ≤ 16. These results for the case N = 1 are recorded here mainly for
completeness, since the experiment ℰ which we consider will in most cases be
repeatable. Two special cases also deserve mention when N = 1. First, if k = 2,
Φ reduces to the statistic of Wilcoxon [11] for the two-sample problem; this is
our most extreme partially ordered case. Secondly, if k = M, Φ becomes the
statistic of Mann [8] for testing against trend; this is our completely ordered
case.
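
The two special cases can be illustrated numerically. The sketch below makes two assumptions not spelled out in the text: for k = 2 the Wilcoxon statistic is taken to be the rank sum W of the first subset, so that (2.4) with c = 2 gives φ = n_1(M+1) − 2W; for k = M the trend statistic is taken to be the number of concordant pairs minus the number of discordant pairs. Function names are hypothetical.

```python
def support(form):
    """phi from (2.5)."""
    M = sum(len(s) for s in form)
    D2 = sum(len(s) ** 2 for s in form)
    P = sum(1 for a in range(len(form)) for b in range(a + 1, len(form))
            for x in form[a] for y in form[b] if x < y)
    return 2 * P + (D2 - M * M) // 2

def wilcoxon_form(first, M):
    """phi for k = 2 written through the rank sum W of the first subset."""
    return len(first) * (M + 1) - 2 * sum(first)

def concordant_minus_discordant(seq):
    """Trend count for a completely ordered observation (k = M)."""
    pairs = [(seq[i], seq[j]) for i in range(len(seq)) for j in range(i + 1, len(seq))]
    return sum(1 if x < y else -1 for x, y in pairs)

two = [[1, 3, 4], [2, 5, 6]]
print(support(two), wilcoxon_form(two[0], 6))                            # 5 5
full = [1, 2, 4, 3]
print(support([[x] for x in full]), concordant_minus_discordant(full))  # 4 4
```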

No exact sampling theory has been developed for N > 1, but in most cases a
normal approximation will give sufficiently accurate results. What is required
in order to validate the normal approximation for Φ is a special case of the
celebrated Central Limit Theorem. It is shown in Appendix B that Φ is
expressible as a sum, and that this sum is approximately normal when the number
of summands is large, irrespective of the conditions underlying the experiment,
that is, whether or not the assumption of “randomness” is correct. For our
purposes in this section it should merely be pointed out that the number of
summands referred to is not N, although the proof in Appendix B does require that
N be large. It is, instead, the number of contributions to Φ, that is, the number
of subsets in N observations minus N. This is the quantity which should be at
least 15 before the normal approximation is used.

We see from Appendix B that under the assumption of randomness, the null
hypothesis for our test,³

\[ E(\Phi) = 0, \qquad V(\Phi) = \sum \sigma^2, \qquad \sigma^2 = \frac{1}{3} \sum_{\lambda=1}^{k-1} (m_\lambda + 1)\, n_\lambda (m_\lambda - n_\lambda). \tag{3.1} \]

An alternate form for σ² which is easier for calculation is

\[ \sigma^2 = \frac{1}{18} \Big\{ M^2 (2M + 3) - 2 \sum_{\lambda=1}^{k} n_\lambda^3 - 3 D^2 \Big\}, \tag{3.2} \]

where D², as before, is the sum of squares of subset sizes. The following will
serve as an illustration.


Example 6. Consider the manual dexterity situation of Example 1. Students
named Brown, Jones, Shaw, and Wilson take 10 performance tests. Their exact
order of completing the assigned task is observed in only 3 out of 10 trials.
Suppose that the teacher’s opinion of the true order of their abilities is
S < J < B < W. The question to be answered is: Do the data offer sufficient evidence
to warrant our consideration of the teacher’s opinion? (See the accompanying table.)
Using a one-tail test we now compute Φ(71.7)^(−1/2) = 1.30. This number falls
short of the 1.64 required for significance at the 5 per cent level for the
upper-tail test, so we conclude that the teacher’s ranking cannot be the true one.
Actually the data in this case would suggest that the students have more or less
the same ability. It is interesting that in Trial 6 the ordering is complete, and
the teacher’s version is upheld.
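
The arithmetic of Example 6 can be reproduced from the ten outcomes tabulated for it. The sketch below is an illustration only: it assumes c = 2, form (2.5) for the support and form (3.2) for the variance, and it encodes each outcome as a string with vertical bars between subsets; the names `ORDER`, `OUTCOMES`, `short_form` and `phi_and_variance` are hypothetical.

```python
import math

ORDER = "SJBW"  # the teacher's opinion S < J < B < W
OUTCOMES = [
    "S|B|JW", "SJB|W", "W|SJB", "J|W|S|B", "SW|B|J",
    "S|J|B|W", "JW|S|B", "J|W|SB", "S|JBW", "J|S|W|B",
]

def short_form(outcome):
    rank = {name: i + 1 for i, name in enumerate(ORDER)}
    return [sorted(rank[ch] for ch in part) for part in outcome.split("|")]

def phi_and_variance(form):
    sizes = [len(s) for s in form]
    M = sum(sizes)
    D2 = sum(n * n for n in sizes)
    P = sum(1 for a in range(len(form)) for b in range(a + 1, len(form))
            for x in form[a] for y in form[b] if x < y)
    phi = 2 * P + (D2 - M * M) // 2                                   # (2.5)
    var = (M * M * (2 * M + 3) - 2 * sum(n ** 3 for n in sizes) - 3 * D2) / 18  # (3.2)
    return phi, var

forms = [short_form(o) for o in OUTCOMES]
Phi = sum(phi_and_variance(f)[0] for f in forms)
V = sum(phi_and_variance(f)[1] for f in forms)
summands = sum(len(f) - 1 for f in forms)  # subsets in N observations minus N
z = Phi / math.sqrt(V)
print(Phi, round(V, 1), summands, round(z, 2))  # 11 71.7 20 1.3
```

The count of summands is 20 here, above the threshold of 15 suggested for the normal approximation.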

Trial  Outcome                    Short form    k    P    D²   Σn³
  1    {S} < {B} < {JW}           (1|3|24)      3    4     6    10
  2    {SJB} < {W}                (123|4)       2    3    10    28
  3    {W} < {SJB}                (4|123)       2    0    10    28
  4    {J} < {W} < {S} < {B}      (2|4|1|3)     4    3     4     4
  5    {SW} < {B} < {J}           (14|3|2)      3    2     6    10
  6    {S} < {J} < {B} < {W}      (1|2|3|4)     4    6     4     4
  7    {JW} < {S} < {B}           (24|1|3)      3    2     6    10
  8    {J} < {W} < {SB}           (2|4|13)      3    2     6    10
  9    {S} < {JBW}                (1|234)       2    3    10    28
 10    {J} < {S} < {W} < {B}      (2|1|4|3)     4    4     4     4

Total                                          30   29    66   136

Φ = 2ΣP + ½(ΣD² − NM²) = 2(29) + ½(66 − 160) = +11;
V(Φ) = 10(16)(11)/18 − 2(136)/18 − 3(66)/18 = 215/3 = 71.7.

³ In what follows E and V will stand for mathematical expectation and variance, respectively.

In the k-sample problem independent random samples of sizes n_1, n_2, ...,
n_k are chosen from k independent populations. Let X_ij be the jth observation
from the ith population, and have a distribution function F_i(x). The problem is
to test the hypothesis that the k probability laws are identical, that is
F_1 = F_2 = ... = F_k. Terpstra ([9], [10]) and Jonckheere [1] considered the
alternative of an upward trend for the observations in consecutive samples. As
mentioned earlier they pointed out connections with Kendall’s τ, and at the
same time generalized the work of Wilcoxon [11] for two samples and Mann
[8] for k individual observations. Limiting results are obtained by increasing
the sample sizes while k remains fixed. In our problem, which is termed the
partial order problem, it is the number of events which is held fixed, while k is
allowed to vary as more and more trials are considered. Summing up, we can
say that in the k-sample problem the number of subsets in the partition is
fixed; in the partial order problem the number of elements to be partitioned is
fixed. It is not possible to establish an equivalence between the problems, by
pooling data in any way, since in the partial order problem the elements of the
k subsets for any given observation can properly be compared only with those
of other subsets for the same observation. In spite of this there is still a close
connection between the limiting distributions under the assumption of
“randomness,” as will be seen in Appendix B.

Kruskal [4] and Kruskal and Wallis [5] also dealt with the k-sample prob- 
lem; moreover, they used rank-order methods. However, they considered ranks 
in the overall sample, and a general alternative to the null hypothesis. This 
produced results differing from those of the other authors mentioned above, 
and of course differing from those obtained here. 


PROBLEM 2 


In this problem we pose the question: Is one of two specified orders more 
likely to be the “true” order than is the other? This is generally the real prob- 
lem. In the field of biology, for example, there is frequently a current theory 
of order which rests on at least a moderate amount of evidence, and which it is 
incumbent upon the investigator to dispute if he cares to put forth a theory of 
his own. The solution of Problem 1 is often merely a formality in preparation for 
an examination of Problem 2. 
A natural solution to Problem 2 is one based on the sign-test. For the two
specified orders we can calculate supports φ and ψ, respectively, for each
observation. Then, considering those observations among the N for which φ
and ψ differ, we can count the number for which φ > ψ. If this result is
significant, we declare the order for which φ was calculated to be the better candidate
for “true” order; that is, we say that the first order receives significantly more
support than the other. We turn now to an interesting, rather important problem
with implications in the study of the effect of radiation on human growth.


Example 7: This is a continuation of Example 2, mentioned in the introduction. 
Kraus [3] was interested in the order of ossification (formation of bone center 
from cartilage) of the 19 post-tarsal bones of the human foot. Such activity 
takes place in an embryo which is in the fetus stage. Most workers seem to feel 
that this process is genetically controlled; there is also a current theory of 
order, call it Theory 1. Kraus presented an alternative theory which we will 
refer to as Theory 2. The two may be described as follows: 





[Table not recoverable: positions in rank assigned by Theory 1 and by Theory 2
to each of the 19 bones, listed as metatarsals (M), distal phalanges (D),
proximal phalanges (P), and intermediate phalanges (I).]








M_i, D_i, P_i, and I_i stand for metatarsal, distal, proximal, and intermediate
phalanges, respectively, with increasing order of the subscripts in the direction
big toe (D_1) to little toe (D_5).

Let φ be the support offered by a trial for Theory 2, and ψ be the support
offered by a trial for Theory 1. The data consist of 138 specimens (dead fetuses)
cleaned and stained so that the bone centers will be visible under a microscope.
Every observation gives us the bone centers which have formed, and by
elimination those which presumably would have formed had the fetus
developed. Thus, we have Problem 2 with M = 19, N = 138; k = 2 for each
observation (those specimens for which no bone centers, or all bone centers, are
present have been discarded).

For a more complete record of Kraus’ data, together with a description of the
place of his work in the study of the genetic effect of radiation on human
populations, the reader should see ref. [3] and its list of references. A summary of
his data will be sufficient for the purpose at hand. For 119 specimens φ = ψ,
and no information is available for Problem 2. For the other 19 specimens
φ > ψ in 16 cases, and φ < ψ in 3 cases. This result is significant at the 0.2 per cent
level if we use it to test that the two theories receive equal support on the
average; consequently, we conclude that Kraus’ order is a better candidate for
“true” order.
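
The sign-test probability quoted for Example 7 can be checked directly. The sketch below assumes the test is one-sided and that, under equal support, each of the 19 untied specimens favors either theory with probability one-half; the function name is hypothetical.

```python
from math import comb

def sign_test_upper_tail(successes, n):
    """P(X >= successes) for X binomial with n trials and p = 1/2."""
    return sum(comb(n, j) for j in range(successes, n + 1)) / 2 ** n

p = sign_test_upper_tail(16, 19)
print(round(p, 4))  # 0.0022
```

The value is 1160/524288, or roughly 0.2 per cent, in line with the level stated above.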

This example is of the type alluded to earlier in this Section, in that both 
orders receive significant support (a fact which is apparent from an inspection 
of Kraus’ data), and the main problem is one of comparison. 


PROBLEM 3 


This problem is the most straightforward of the three. The question to be
answered is: Assuming that there is a “true” order, what is it? The solution in
the spirit of the present paper is that the estimate for the “true” order is that
order which receives maximum support. Kendall ([2], Chapter 6.10) solves an
equivalent problem differently. He considers a situation in which observers,
“among which there is evidence of some agreement,” rank a series of objects
for which there is a true order. The object which receives the smallest sum of
ranks is “first,” the object receiving the next highest sum of ranks is “second,”
and so on, with any ties broken by a consideration of sums of squares of ranks.
Corresponding to his “observers with some agreement” we have repeated trials
under equivalent experimental conditions. The following will serve to illustrate
both methods for a situation in which the two results differ.


Example 8. Let A, B, C, D be 4 events which occur on each trial. For
convenience we will consider only completely ordered observations. The vertical
bar notation for an observation is then unnecessary, and will be omitted.


EXPERIMENTAL OUTCOMES

Specified
order       ABCD      ABDC      ABDC      DCAB      CABD     Total
ABCD       1234(6)   1243(5)   1243(5)   4312(1)   3124(4)     21
ABDC       1243(5)   1234(6)   1234(6)   3412(2)   4123(3)     22
ADBC       1342(4)   1324(5)   1324(5)   2413(3)   4132(2)     19
ACBD       1324(5)   1342(4)   1342(4)   4213(2)   2134(5)     20
CABD       2314(4)   2341(3)   2341(3)   4123(3)   1234(6)     19

A             1         1         1         3         2         8
B             2         2         2         4         3        13
C             3         4         4         2         1        14
D             4         3         3         1         4        15


Columns two through six correspond to observations, while rows one to five
correspond to possible “true” orders. Numbers in parentheses are P-values
corresponding to the specified order (row), and outcome (column). The numbers
in the body of the table are the short forms for the outcomes with respect to the
possible “true” orders. Total refers to sum of P-values. The last four rows
pertain to Kendall’s method; for example, column 5 contains 3, 4, 2, 1 as its last
four numbers, since they are positions in rank for A, B, C, D in the outcome
DCAB. Total for these rows refers to sum of ranks. The specified order ABDC
receives the maximum sum of P-values of 22, and is therefore our estimate.⁴
Checking sums of ranks, we see that ABCD is Kendall’s estimate.
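
Both estimates of Example 8 can be recomputed by examining all 4! candidate orders. The sketch below is an illustration, not the author's procedure: it uses the fact that for completely ordered observations D² is constant, so supports can be compared through P-values, and it counts P as the number of pairs placed in the same relative order by the candidate and the outcome. Names are hypothetical.

```python
from itertools import permutations

OUTCOMES = ["ABCD", "ABDC", "ABDC", "DCAB", "CABD"]

def p_value(order, outcome):
    """P for a completely ordered outcome relative to a specified order."""
    rank = {event: i for i, event in enumerate(order)}
    seq = [rank[e] for e in outcome]
    return sum(1 for i in range(len(seq)) for j in range(i + 1, len(seq))
               if seq[i] < seq[j])

totals = {"".join(p): sum(p_value(p, o) for o in OUTCOMES)
          for p in permutations("ABCD")}
best = max(totals, key=totals.get)

# Kendall's method: order the events by their sums of ranks over the outcomes.
rank_sums = {e: sum(o.index(e) + 1 for o in OUTCOMES) for e in "ABCD"}
kendall = "".join(sorted("ABCD", key=rank_sums.get))

print(best, totals[best])  # ABDC 22
print(kendall, rank_sums)  # ABCD {'A': 8, 'B': 13, 'C': 14, 'D': 15}
```

No two rank sums are tied in this example, so the tie-breaking rule based on sums of squares of ranks is not needed here.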


⁴ The specified orders considered here were given because they were the five receiving highest support; all 4!
possibilities were looked at. Note also that we can compare supports by comparing P-values in this example, be-
cause $D^2$ has the constant value 4 for all observations.


APPENDIX
A. DERIVATION AND OPTIMAL PROPERTIES FOR $\Phi$


In the discussions of Sections 1 and 2 we introduced an experiment $\mathcal{E}$, each
trial of which produces events $E_1, E_2, \ldots, E_M$. A trial gives these events in a
partial order, that is, in the form of an ordered collection of subsets $B_\lambda$,
$\lambda = 1, 2, \ldots, k$. If some particular complete order of events is under consideration
as a candidate for “true” order in the sense of Section 1, then we define $S_\lambda$
as the set of ranks corresponding to the elements of $B_\lambda$ with respect to this
particular order. Also we define

$$T_\lambda = \bigcup_{\gamma=\lambda}^{k} S_\gamma,$$

with $n_\lambda$ as the number of elements in $S_\lambda$, and $m_\lambda$ as the number of elements in
$T_\lambda$. Moreover, we introduce ranks $r_{\lambda i}$, $i = 1, 2, \ldots, m_\lambda$, $\lambda = 1, 2, \ldots, k$,
for the elements of $T_\lambda$.

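In terms of the above notation a quantity $\Phi$ is introduced, and assumed to
satisfy the following conditions:

1. For a particular trial $S_1 < S_2 < \cdots < S_k$ there is associated with $S_\lambda$ a real-
valued function $f_\lambda(r_{\lambda 1}, r_{\lambda 2}, \ldots, r_{\lambda n_\lambda})$, $\lambda = 1, 2, \ldots, k-1$. Let

$$\phi = \sum_{\lambda=1}^{k-1} f_\lambda(r_{\lambda 1}, r_{\lambda 2}, \ldots, r_{\lambda n_\lambda}),$$

with the functions $f_\lambda$ satisfying, for $\lambda = 1, 2, \ldots, k-1$:

(i) $f_\lambda(1, 2, \ldots, n_\lambda) = -f_\lambda(m_\lambda - n_\lambda + 1,\; m_\lambda - n_\lambda + 2,\; \ldots,\; m_\lambda)$,

(ii) $f_\lambda(r_{\lambda 1}, \ldots, r_{\lambda(i-1)}, r_{\lambda i}, r_{\lambda(i+1)}, \ldots, r_{\lambda n_\lambda}) - f_\lambda(r_{\lambda 1}, \ldots, r_{\lambda(i-1)}, r_{\lambda j}, r_{\lambda(i+1)}, \ldots, r_{\lambda n_\lambda}) = c\,(r_{\lambda j} - r_{\lambda i})$,
for $i = 1, 2, \ldots, n_\lambda$; $j = n_\lambda + 1, \ldots, m_\lambda$; $\lambda = 1, 2, \ldots, k-1$; $c > 0$.

2. If the experiment is repeated $N$ times independently, with possibly different
numbers of subsets on the trials and subset sizes, then

$$\Phi = \sum_{\alpha=1}^{N} \phi_\alpha,$$

where $\phi_\alpha$ is defined as in condition 1, for trial $\alpha$.

A statistic $\Phi$ defined as above gives as its value the “support” for the par-
ticular order, offered by the $N$ trials.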


Theorem 1: If $\Phi$ satisfies the above two conditions, then $\Phi$ is uniquely deter-
mined up to a constant multiple.


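Proof: From 1(ii) it may be seen that $f_\lambda(r_{\lambda 1}, r_{\lambda 2}, \ldots, r_{\lambda n_\lambda})$ may be increased
to $f_\lambda(1, 2, \ldots, n_\lambda)$ by successive exchanges of the elements in the $S_\lambda$'s; that is,

$$f_\lambda(1, 2, \ldots, n_\lambda) - f_\lambda(r_{\lambda 1}, r_{\lambda 2}, \ldots, r_{\lambda n_\lambda}) = c \sum_{i=1}^{n_\lambda} (r_{\lambda i} - i). \tag{A.1}$$

Substituting the extreme values $r_{\lambda i} = m_\lambda - n_\lambda + i$ into (A.1), and using 1(i) above,
we obtain

$$f_\lambda(1, 2, \ldots, n_\lambda) = \tfrac{1}{2}\, c\, n_\lambda (m_\lambda - n_\lambda). \tag{A.2}$$

Now, another substitution, (A.2) into (A.1), and the use of condition 1 above,
gives

$$\phi = \tfrac{1}{2}\, c \sum_{\lambda=1}^{k-1} \Bigl\{ n_\lambda (m_\lambda + 1) - 2 \sum_{i=1}^{n_\lambda} r_{\lambda i} \Bigr\}.$$

$\Phi$ can now be determined by introducing the $\alpha$ subscript in $\phi$, $k$, $n_\lambda$, $m_\lambda$, $r_{\lambda i}$,
and summing over $\alpha = 1, 2, \ldots, N$. This verifies (2.4) of Section 2.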
It has been shown by Terpstra and Jonckheere that if $c = 2$, $\phi$ is equivalent to

$$\phi = 2P - \sum_{\lambda=1}^{k-1} n_\lambda (m_\lambda - n_\lambda). \tag{A.3}$$

Terpstra and Jonckheere both use $S$ for $\phi$; also the former uses $T$ for $P$. The
form (A.3) was originally studied by Kendall in connection with tied ranks in
rank correlation.
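
The agreement between the rank form of $\phi$ (with $c = 2$) and the pair-count form (A.3) can be checked numerically. The sketch below makes two assumptions that the Appendix does not spell out: the ranks entering $f_\lambda$ are taken as the ranks of the elements of $S_\lambda$ within $T_\lambda$, and $P$ is taken as the number of pairs, drawn from two different subsets, that the observation places in the same order as the candidate order. An observation is written as an ordered list of subsets of ranks; the function names are illustrative.

```python
from itertools import product


def support_from_ranks(subsets):
    """phi with c = 2: sum over lambda < k of n(m + 1) - 2 * (sum of ranks)."""
    phi = 0
    for lam in range(len(subsets) - 1):
        tail = sorted(x for s in subsets[lam:] for x in s)  # T_lambda
        n, m = len(subsets[lam]), len(tail)
        ranks = [tail.index(x) + 1 for x in subsets[lam]]   # S_lambda within T_lambda
        phi += n * (m + 1) - 2 * sum(ranks)
    return phi


def support_from_pairs(subsets):
    """phi = 2P - sum of n(m - n), with P the number of pairs in agreement."""
    p = 0
    correction = 0
    for lam, s in enumerate(subsets[:-1]):
        later = [y for t in subsets[lam + 1:] for y in t]   # T_lambda minus S_lambda
        p += sum(1 for x, y in product(s, later) if x < y)
        correction += len(s) * len(later)                   # n(m - n)
    return 2 * p - correction


def maximum_support(subsets):
    """(M^2 - D^2) / 2 for the subset sizes of this observation."""
    sizes = [len(s) for s in subsets]
    return (sum(sizes) ** 2 - sum(n * n for n in sizes)) // 2


observation = [[2, 1], [4], [3]]  # a partial order: {1, 2} first, then 4, then 3
print(support_from_ranks(observation), support_from_pairs(observation),
      maximum_support(observation))
```

For a complete order of four events the correction term is always 6, which is why supports in Example 8 can be compared through the P-values alone.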
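We now verify that the computational form for $\phi$, expression (2.5), is equiv-
alent to (2.4).

$$\sum_{\lambda=1}^{k-1} n_\lambda (m_\lambda - n_\lambda) = \sum_{\lambda=1}^{k-1} \sum_{\gamma=\lambda+1}^{k} n_\lambda n_\gamma = \frac{1}{2} \Bigl[ \Bigl( \sum_{\lambda=1}^{k} n_\lambda \Bigr)^2 - \sum_{\lambda=1}^{k} n_\lambda^2 \Bigr].$$

Thus, by (A.3) and the facts that

$$M = \sum_{\lambda=1}^{k} n_\lambda \quad \text{and} \quad D^2 = \sum_{\lambda=1}^{k} n_\lambda^2,$$

we have (2.5).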

Additional properties of $\phi$ were listed at the end of Section 2. We now dis-
cuss these.

(1) If the $r_{\lambda i}$ associated with $S_\lambda$ are symmetrically placed around $\tfrac{1}{2}(m_\lambda + 1)$,
then $f_\lambda(r_{\lambda 1}, r_{\lambda 2}, \ldots, r_{\lambda n_\lambda}) = 0$.

Proof: This follows immediately from the alternate form

$$f_\lambda(r_{\lambda 1}, r_{\lambda 2}, \ldots, r_{\lambda n_\lambda}) = 2 \sum_{i=1}^{n_\lambda} \Bigl( \frac{m_\lambda + 1}{2} - r_{\lambda i} \Bigr),$$

which was essentially obtained in the course of proving Theorem 1.

(2) For a fixed observation the contributions $f_\lambda(r_{\lambda 1}, r_{\lambda 2}, \ldots, r_{\lambda n_\lambda})$, $\lambda = 1,
2, \ldots, k-1$, are independent random variables. (Terpstra [10]).

Proof: Any rearrangement of elements in the set $T_\lambda - S_\lambda$ leaves $f_\lambda$ invariant,
so $f_\lambda$ is independent of $f_\gamma$, $\gamma > \lambda$.

(3) For an observation with $k$ subsets, and fixed subset sizes, the maximum
possible support is

$$\sum_{\lambda=1}^{k-1} n_\lambda (m_\lambda - n_\lambda) = \tfrac{1}{2}(M^2 - D^2).$$

Proof: $\phi$ is maximized when $P$ is, and the extreme values of $P$ are obviously
0 and

$$\sum_{\lambda=1}^{k-1} n_\lambda (m_\lambda - n_\lambda).$$

Substitute the latter into (A.3).
(4) For an observation with $k$ subsets, the maximum possible support, men-
tioned in (3), is greatest if each subset has $M/k$ elements.


Proof: Consider the computational form given as (2.5), and also mentioned
earlier in the present section. For fixed $M$,

$$\tfrac{1}{2}(M^2 - D^2) = \tfrac{1}{2}\Bigl( M^2 - \sum_{\lambda=1}^{k} n_\lambda^2 \Bigr)$$

is a maximum when all $n_\lambda$ are equal, and this is all that is needed.

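Instead of using the $r_{\lambda i}$ to formulate our conditions for $\Phi$, we could use $t_{\lambda\gamma i}$,
ranks in the set $S_\lambda \cup S_\gamma$; $\lambda = 1, 2, \ldots, \gamma - 1$; $\gamma = 2, 3, \ldots, k$; $i = 1, 2, \ldots,
n_\lambda + n_\gamma$. Our conditions would then concern the $t_{\lambda\gamma i}$, $i = 1, 2, \ldots, n_\lambda$, for each
$S_\lambda$ and $S_\gamma$. With the exception of Property (2) above, the derivation, and hence
the optimal properties, hold with the obvious notational changes $n_\lambda + n_\gamma$ for $m_\lambda$ and
$S_\lambda \cup S_\gamma$ for $T_\lambda$.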

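B. ASYMPTOTIC NORMALITY FOR $\Phi$

In this sub-section we discuss normal approximations. If we use the form
mentioned in the proof of Property (1), we have, from Property (2), that the
quantities

$$f_{\alpha\lambda} = 2 \sum_{i=1}^{n_{\alpha\lambda}} \Bigl( \frac{m_{\alpha\lambda} + 1}{2} - r_{\alpha\lambda i} \Bigr) \tag{B.1}$$

are statistically independent; $\lambda = 1, 2, \ldots, k_\alpha - 1$; $\alpha = 1, 2, \ldots, N$. Denote
the expected value and variance of

$$\phi_\alpha = \sum_{\lambda} f_{\alpha\lambda}$$

by $E\phi_\alpha = \mu_\alpha$, $V(\phi_\alpha) = \sigma_\alpha^2$. Also, let $\tau_\alpha^3 = E|\phi_\alpha - \mu_\alpha|^3$.

Theorem 2: $\Phi$ is asymptotically normal with mean and variance

$$\mu = \sum_{\alpha=1}^{N} \mu_\alpha, \qquad V = \sum_{\alpha=1}^{N} \sigma_\alpha^2.$$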

Proof: Let us consider the general case in which a probability distribution
exists on each trial for the partition into number of subsets and subset sizes.
Denote the density of this distribution by $g_\alpha(\pi)$. Ruling out the trivial case of
a non-random trial we can assert that $V(\phi_\alpha \mid \pi) > 0$ for each $\pi$. Since there are
finitely many possibilities for $\pi$, we can find a lower bound for this quantity
which is independent of $\alpha$. It can be shown from this that

$$\sigma_\alpha^2 = \sum_{\pi} E(\phi_\alpha^2 \mid \pi)\, g_\alpha(\pi) - \Bigl( \sum_{\pi} E(\phi_\alpha \mid \pi)\, g_\alpha(\pi) \Bigr)^2 > a^2 > 0$$

for some $a^2$. From this and the fact that $\tau_\alpha^3$ must be bounded above by $A$, say,
we see that

$$\frac{\bigl( \sum_{\alpha=1}^{N} \tau_\alpha^3 \bigr)^{1/3}}{\bigl( \sum_{\alpha=1}^{N} \sigma_\alpha^2 \bigr)^{1/2}} \le \frac{(AN)^{1/3}}{(a^2 N)^{1/2}}, \tag{B.2}$$

which tends to zero when $N \to \infty$. (B.2) is exactly the condition required in the
Liapounov [7] form of the Central Limit Theorem. Note that the basic inde-
pendent summands are actually the $f_{\alpha\lambda}$. It was merely easier to write the proof
in terms of the $\phi_\alpha$.

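For the case of “randomness” each $r_{\alpha\lambda i}$ has a discrete uniform distribution
over $1, 2, \ldots, m_{\alpha\lambda}$, so from well-known formulae concerning sampling without
replacement, we have

$$E(r_{\alpha\lambda i}) = \tfrac{1}{2}(m_{\alpha\lambda} + 1), \qquad \sigma_{\alpha\lambda}^2 = 4 n_{\alpha\lambda} \Bigl( \frac{m_{\alpha\lambda}^2 - 1}{12} \Bigr) \Bigl( \frac{m_{\alpha\lambda} - n_{\alpha\lambda}}{m_{\alpha\lambda} - 1} \Bigr),$$

and finally we have from Theorem 2 asymptotic normality with

$$E(\Phi) = 0, \qquad V(\Phi) = \frac{1}{3} \sum_{\alpha=1}^{N} \sum_{\lambda=1}^{k_\alpha - 1} n_{\alpha\lambda} (m_{\alpha\lambda} + 1)(m_{\alpha\lambda} - n_{\alpha\lambda}). \tag{B.3}$$

Thus, we essentially arrive at the result of Jonckheere ([1], p. 138), by a differ-
ent method of proof. The same thing was derived by Terpstra [9], but ex-
pressed differently.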
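
As a check on (B.3) for a single trial, the variance of $\phi$ under “randomness” can be computed exactly by enumeration and compared with the formula. The sketch below assumes that “randomness” means every assignment of the ranks 1, ..., M to subsets of the given sizes is equally likely, and it takes the ranks in (B.1) as ranks within $T_\lambda$; the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def phi(subsets):
    """Support for one trial (c = 2), summing n(m + 1) - 2 * (sum of ranks)."""
    total = 0
    for lam in range(len(subsets) - 1):
        tail = sorted(x for s in subsets[lam:] for x in s)
        n, m = len(subsets[lam]), len(tail)
        total += n * (m + 1) - 2 * sum(tail.index(x) + 1 for x in subsets[lam])
    return total


def variance_formula(sizes):
    """One-trial version of (B.3): (1/3) * sum of n(m + 1)(m - n)."""
    total = Fraction(0)
    m = sum(sizes)
    for n in sizes[:-1]:
        total += Fraction(n * (m + 1) * (m - n), 3)
        m -= n  # the next T_lambda drops the n elements just used
    return total


def exact_moments(sizes):
    """Mean and variance of phi over all equally likely rank assignments."""
    values = []
    for perm in permutations(range(1, sum(sizes) + 1)):
        subsets, start = [], 0
        for n in sizes:
            subsets.append(perm[start:start + n])
            start += n
        values.append(phi(subsets))
    mean = Fraction(sum(values), len(values))
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


print(exact_moments((2, 1, 1)), variance_formula((2, 1, 1)))
```

Enumeration grows factorially with M, so this is only a check for small cases, not a way of computing the variance in practice.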

C. DISCUSSION OF SIGNIFICANCE TESTS 


First, consistency will be shown for the test used in the solution of Problem 1
in Section 3, against a specified set of alternatives. For the sake of discussion
let us consider a significance level of 5 per cent. Then, the hypothesis $H_0$ is that
of “randomness,” and the test is

Reject $H_0$ if $\Phi > 1.64\,\sigma_\Phi$,

where $\sigma_\Phi$ is the standard deviation obtained from $V(\Phi)$ in (B.3). Recalling the
meaning of $E(\phi_\alpha \mid \pi)$ from the proof of Theorem 2, we can state


Theorem 3: The test of $H_0$ based on $\Phi$ is consistent against alternatives for
which $E(\phi_\alpha \mid \pi) > 0$ for each $\pi$.


Proof: Using the same type of reasoning as in the case of Theorem 2 we can
show that for some $b^2 < \infty$, and $c^2 > 0$,

$$\sigma_\alpha^2 < b^2, \qquad \mu_\alpha > c^2, \qquad \alpha = 1, 2, \ldots, N.$$

Now, let $p_N$ be the probability of rejecting $H_0$ under the alternative. Then,

$$p_N = P(\Phi > 1.64\,\sigma_\Phi) = P\Bigl( \frac{\Phi - \mu}{\sqrt{V}} > \frac{1.64\,\sigma_\Phi - \mu}{\sqrt{V}} \Bigr) \ge P\Bigl( \frac{\Phi - \mu}{\sqrt{V}} > \frac{1.64\, b \sqrt{N} - c^2 N}{a \sqrt{N}} \Bigr). \tag{C.1}$$

It is known that if a sequence of distribution functions approaches a distribu-
tion function which is continuous, it does so uniformly. Since $\Phi$ is asymptoti-
cally normal by Theorem 2, and

$$a^{-1} N^{-1/2} \bigl( 1.64\, b \sqrt{N} - c^2 N \bigr) \to -\infty, \qquad p_N \to 1 \text{ by (C.1)}.$$

The following are some general results concerning Problems 1, 2, and 3, 
respectively. 

The significance test of $H_0$ based on $\Phi$ is unbiased against all alternatives
with $E(\phi_\alpha) > 0$, $\alpha = 1, 2, \ldots, N$. This is an application of a theorem of Leh-
mann ([6], p. 256, Example 30).

The sign test used in Problem 2 of Section 3 is known to be consistent and 
unbiased against all alternatives for which ¢. is stochastically larger than ya, 
which is the appropriate alternative hypothesis for our purposes. 

The solution to Problem 3 has optimal properties by virtue of its being the
only natural solution based on support, since $\Phi$ was shown to be the unique
statistic having the properties ascribed to it. It should be noted, however, that
condition 1(ii) is rather arbitrary. Any change in formulation here could
produce a radical change in the form of $\Phi$.
A NOTE ON MEASUREMENT ERRORS AND DETECTING
REAL DIFFERENCES


Eugene Rogot
National Institute of Neurological Diseases and Blindness


Two groups, $N_1 = a_1 + b_1$ and $N_2 = a_2 + b_2$, are compared in terms of
their respective morbidity rates, $p_1$ and $p_2$, in the presence of errors of
misclassification ($a_1$ and $a_2$ are sure components; $b_1$ and $b_2$ are potential
error terms). The four extreme situations for the compositions of $N_1$
and $N_2$ are considered. Rate $p_1$ is assumed to apply to $a_1$ and $b_1$; $p_2$
is assumed to apply to $a_2$ and $b_2$. If there is a difference in rates for the
error-free situation, then the observed difference is smaller in each of
the error situations. Further, measurement errors reduce the chance
of detecting a real difference.


In many public health studies, the investigator ultimately relies on a very
simple type of statistical analysis: a comparison of two rates. A typical
setup is as follows:


                       Disease    No Disease     Total     Morbidity Rate

Condition present      $n_1$      $N_1 - n_1$    $N_1$     $p_1 = n_1/N_1$
Condition absent       $n_2$      $N_2 - n_2$    $N_2$     $p_2 = n_2/N_2$

Total                  $n$        $N - n$        $N$       $p = n/N$






The method of analysis here is to compare the morbidity rate for the group
$N_1$ with condition present with that for the group $N_2$ with condition absent
and perhaps to test the observed difference $(p_1 - p_2)$ for statistical significance.
The null hypothesis, $H_0\colon P_1 = P_2$, might be tested against the one-sided al-
ternative, $P_1 > P_2$, where $P_1$ and $P_2$ are the population parameters of which
$p_1$ and $p_2$ are estimates. In this instance a binomial model or, where appropriate,
the normal or some other approximation to the binomial may be used. The
probability of making the Type I Error (rejecting the null hypothesis when
it is true) is usually set at a maximum of 5%; the Type II Error (accepting the
null hypothesis when it should be rejected) is usually unknown. In this paper
we will be concerned only with the latter kind of error, in particular with
“measurement errors” and how they affect the Type II Error. A real difference
in rates will be postulated so that the probability of making the Type I Error
is set at zero. By a “measurement error” we mean any of the following singly
or in combination:


E-1: an individual should be classified in $N_1$, but is erroneously classified in
$N_2$.

E-2: an individual should be classified in $N_2$, but is erroneously classified in
$N_1$.

E-3: an individual should be classified in $n$, but is erroneously classified in
$N - n$.

E-4: an individual should be classified in $N - n$, but is erroneously classified
in $n$.

Consider now only errors E-1 and E-2. (For simplicity the errors E-3 and
E-4 will not be considered in this paper. They can, however, be treated in the
same way as E-1 and E-2, and would show the same kinds of relationships.)
Assume $N_1 = a_1 + b_1$ where $a_1$ are those individuals certain to be classified in
$N_1$ and $b_1$ are those individuals who may for one reason or another be erro-
neously classified in $N_2$. Similarly, assume $N_2 = a_2 + b_2$, where $a_2$ is the sure com-
ponent and $b_2$ is the potential error term. Suppose now we have a typical setup
as above and we compare $p_1$ with $p_2$. What is the observed difference? Is it
significant? Specifically, assuming a difference in fact exists, i.e., $(P_1 - P_2) = d$,
where $d > 0$, what will be the observed difference under the following extreme
situations?¹

$$N_1^{(1)} = a_1 + b_1 \quad \text{and} \quad N_2^{(1)} = a_2 + b_2 \tag{1}$$

$$N_1^{(2)} = a_1 \quad \text{and} \quad N_2^{(2)} = a_2 + b_2 + b_1 \tag{2}$$

$$N_1^{(3)} = a_1 + b_1 + b_2 \quad \text{and} \quad N_2^{(3)} = a_2 \tag{3}$$

$$N_1^{(4)} = a_1 + b_2 \quad \text{and} \quad N_2^{(4)} = a_2 + b_1 \tag{4}$$


Situation (1) is the theoretical one in which no measurement errors arise.
Let $p_1 = P_1$ and $p_2 = P_2$. Then $p_1 - p_2 = d$. We also assume here and throughout
this paper that the rate $p_1$ applies to $a_1$ and also to $b_1$; similarly, the rate $p_2$
applies to $a_2$ as well as to $b_2$.²

In Situation (2), the measurement error is such that all of the individuals who
constitute the potential error term $b_1$ are erroneously classified in $N_2$. This is
the kind of error described under E-1.

In Situation (3), the measurement error is such that all of the individuals
who constitute the potential error term $b_2$ are erroneously classified in $N_1$. This
is the kind of error described under E-2.

In Situation (4), the measurement error is such that all of the individuals who
constitute the potential error term $b_1$ are erroneously classified in $N_2$, and all
of the individuals who constitute the potential error term $b_2$ are erroneously
classified in $N_1$. This is a combination of errors of the kind described under
E-1 and E-2.

The calculation of observed differences in rates for the four situations
(labelled $d_1$, $d_2$, $d_3$ and $d_4$, respectively) is shown in Table 1.

For Situations (2), (3) and (4), the observed difference in rates is always less
than that for Situation (1), since in each instance we have the relationship
$d_i = ed$, where $e < 1$. It is important to note here that ratios of the observed differences
for the four situations are independent of $d$, the actual difference in rates. In
Situation (4) it is even possible to observe a negative difference in rates, if
$b_1 b_2 > a_1 a_2$. This would mean that, in spite of a real difference in rates where $p_1 > p_2$,
due to measurement errors we may observe $p_2$ to be greater than $p_1$. Further





1 Superscripts (1), (2), (3) and (4) over $N_1$ and $N_2$ are used to distinguish $N_1$ and $N_2$ values for the four situa-
tions.
2 However, this condition may not hold exactly in practice. For example, suppose $N_1 = 30$, $n_1 = 8$ and hence
$p_1 = 8/30 = .27$. Then if $b_1 = 5$ and $a_1 = 25$, the rate of disease in $b_1$ would have to be one of the values 0, .2, .4, .6,
.8 or 1.0, and cannot, in fact, be equal to $p_1 = .27$. Similarly, .27 is not a possible value for the disease rate in $a_1$.
This discrepancy should, of course, become less critical as the sizes of $a_1$ and $b_1$ increase.
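TABLE 1. CALCULATION OF OBSERVED DIFFERENCES
IN RATES FOR FOUR SITUATIONS


Situation (1): $N_1^{(1)} = a_1 + b_1$, $N_2^{(1)} = a_2 + b_2$
    Rate for $N_1$: $p_1$
    Rate for $N_2$: $p_2$
    Observed difference in rates (Rate for $N_1$ - Rate for $N_2$): $d_1 = d$

Situation (2): $N_1^{(2)} = a_1$, $N_2^{(2)} = a_2 + b_2 + b_1$
    Rate for $N_1$: $p_1$
    Rate for $N_2$: $\dfrac{p_2(a_2 + b_2) + p_1 b_1}{a_2 + b_2 + b_1}$
    Observed difference in rates: $d_2 = d \cdot \dfrac{a_2 + b_2}{a_2 + b_2 + b_1}$

Situation (3): $N_1^{(3)} = a_1 + b_1 + b_2$, $N_2^{(3)} = a_2$
    Rate for $N_1$: $\dfrac{p_1(a_1 + b_1) + p_2 b_2}{a_1 + b_1 + b_2}$
    Rate for $N_2$: $p_2$
    Observed difference in rates: $d_3 = d \cdot \dfrac{a_1 + b_1}{a_1 + b_1 + b_2}$

Situation (4): $N_1^{(4)} = a_1 + b_2$, $N_2^{(4)} = a_2 + b_1$
    Rate for $N_1$: $\dfrac{p_1 a_1 + p_2 b_2}{a_1 + b_2}$
    Rate for $N_2$: $\dfrac{p_2 a_2 + p_1 b_1}{a_2 + b_1}$
    Observed difference in rates: $d_4 = d \cdot \dfrac{a_1 a_2 - b_1 b_2}{a_1 a_2 + a_1 b_1 + a_2 b_2 + b_1 b_2}$
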

it can be shown algebraically that Situation (4) always gives the smallest ob-
served difference in rates (see Appendix Proofs I and II).

A comparison of the observed difference in rates for Situation (2) with Situa-
tion (3) shows the following:


1) If $a_1 = a_2$ and $b_1 = b_2$, then $d_2 = d_3$.

2) If $a_1 < a_2$ and $b_1 < b_2$, then $d_2$ is always larger than $d_3$. (Appendix Proof III)

3) If $a_1 > a_2$ and $b_1 > b_2$, then $d_2$ is always smaller than $d_3$.

Suppose now that the observed difference in each of the four extreme situa-
tions is to be tested for statistical significance. For convenience we restrict
ourselves to the normal approximation to the binomial written in the following
way:

$$K = (p_1 - p_2)\sqrt{N_1 N_2} \big/ \sqrt{N p (1 - p)}, \quad \text{where } p_1 = n_1/N_1,\; p_2 = n_2/N_2,\; p = n/N.$$

Assuming as above that $(p_1 - p_2) = d$, where $d > 0$, what will be the observed value
for $K$ in each of the extreme situations? If we let $M = (p_1 - p_2)/\sqrt{N p (1 - p)}$, our
four values of $K$ (labelled $K_1$, $K_2$, $K_3$, $K_4$) are:

$$K_1 = M \sqrt{(a_1 + b_1)(a_2 + b_2)}$$

$$K_2 = M \Bigl( \frac{a_2 + b_2}{a_2 + b_2 + b_1} \Bigr) \sqrt{a_1 (a_2 + b_2 + b_1)}$$

$$K_3 = M \Bigl( \frac{a_1 + b_1}{a_1 + b_1 + b_2} \Bigr) \sqrt{(a_1 + b_1 + b_2)\, a_2}$$

$$K_4 = M \Bigl( \frac{a_1 a_2 - b_1 b_2}{a_1 a_2 + a_1 b_1 + a_2 b_2 + b_1 b_2} \Bigr) \sqrt{(a_1 + b_2)(a_2 + b_1)}$$

It can be shown that $K_1$ is always the largest and $K_4$ the smallest of the four
values. (See Appendix Proofs IV-VII.) This implies, then, that under our as-
sumptions measurement errors of the types described tend to increase the prob-
ability of making the Type II Error.
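
The four values of $K$ follow from the quantities $M$, $N_1 = a_1 + b_1$, $N_2 = a_2 + b_2$ and $S = (a_1 + b_2)(a_2 + b_1)$ used in the Appendix. As a sketch, again with the values of the paper's numerical example and an illustrative function name:

```python
from math import sqrt


def k_values(a1, b1, a2, b2, p1, p2):
    """K for Situations (1)-(4), with p1 and p2 the true (error-free) rates."""
    n1, n2 = a1 + b1, a2 + b2
    big_n = n1 + n2
    p = (p1 * n1 + p2 * n2) / big_n           # overall rate, the same in every situation
    m = (p1 - p2) / sqrt(big_n * p * (1 - p))
    s = (a1 + b2) * (a2 + b1)                 # equals a1*a2 + a1*b1 + a2*b2 + b1*b2
    return {
        1: m * sqrt(n1 * n2),
        2: m * n2 / (n2 + b1) * sqrt(a1 * (n2 + b1)),
        3: m * n1 / (n1 + b2) * sqrt(a2 * (n1 + b2)),
        4: m * (a1 * a2 - b1 * b2) / s * sqrt(s),
    }


k = k_values(a1=900, b1=100, a2=8100, b2=900, p1=0.02, p2=0.01)
for situation, value in k.items():
    print(situation, round(value, 2))
```

With a one-sided 5 per cent criterion of about 1.64 the difference would still be declared significant in all four situations of this example, but the margin shrinks steadily from Situation (1) to Situation (4).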
A comparison of $K_2$ with $K_3$ shows the following:


1) If $a_1 = a_2$ and $b_1 = b_2$, then $K_2 = K_3$.

2) If $a_1 < a_2$ and $b_1 < b_2$, then $K_2$ need not be greater than $K_3$ in spite of the
fact that $d_2 > d_3$. However, if we add the condition that $a_1 b_2 > a_2 b_1$, then
$K_2 > K_3$. (Appendix Proof VIII)

3) If $a_1 > a_2$ and $b_1 > b_2$, then $K_2$ need not be smaller than $K_3$, although
$d_2 < d_3$. If we add the condition that $a_1 b_2 < a_2 b_1$, then $K_2 < K_3$.


To illustrate some of the above points, a numerical example has been con- 
structed and is presented in Table 2. In this example, we wish to relate the 


TABLE 2. NUMERICAL EXAMPLE

$a_1 = 900$, $a_2 = 8{,}100$, $b_1 = 100$, $b_2 = 900$

                 Situation (1)    Situation (2)        Situation (3)        Situation (4)
$N_1 =$          $a_1 + b_1$      $a_1$                $a_1 + b_1 + b_2$    $a_1 + b_2$
$N_2 =$          $a_2 + b_2$      $a_2 + b_2 + b_1$    $a_2$                $a_2 + b_1$

$N$                 10,000           10,000               10,000               10,000
$N_1$                1,000              900                1,900                1,800
$N_2$                9,000            9,100                8,100                8,200
$p$                 .01100           .01100               .01100               .01100
$p_1$               .02000           .02000               .01526               .01500
$p_2$               .01000           .01011               .01000               .01012

$d = p_1 - p_2$     .01000           .00989               .00526               .00488
$K$                   2.88             2.71                 1.98                 1.80

presence or absence of generalized cyanosis among newborns in the first week
of life, with the presence or absence of brain damage at age 7 years. Starting
with 10,000 babies ($N$), suppose 1,000 ($N_1$) are cyanotic in the first week of life
and 9,000 ($N_2$) are not; of the 1,000 with cyanosis, 2% ($p_1$) show evidence of
brain damage at age 7 and of the 9,000 without cyanosis, 1% ($p_2$) show evi-
dence of brain damage at that age. This is Situation (1), the true situation.

Now consider the composition of $N_1$. Assume that 900 ($a_1$) of the 1,000
cyanotic babies are unmistakably cyanotic and that the remaining 100 ($b_1$)
babies for one reason or another may be mistakenly classified as not cyanotic.
Similarly assume the 9,000 babies comprising $N_2$ include 8,100 ($a_2$) babies un-
mistakably negative and 900 ($b_2$) babies who may be erroneously classified in
$N_1$. Situations (2), (3) and (4) represent the possible extreme error situations.

In this example note that $a_1 < a_2$, $b_1 < b_2$, and $a_1 b_2 = a_2 b_1$. We find that $d_1 > d_2 >
d_3 > d_4$ and $K_1 > K_2 > K_3 > K_4$. Measurement errors have reduced our chances of
detecting a difference where one in fact exists.


APPENDIX 
For all proofs
$$N_1 = a_1 + b_1, \quad N_2 = a_2 + b_2, \quad S = a_1 a_2 + a_1 b_1 + a_2 b_2 + b_1 b_2.$$


I) To prove: $d_2 > d_4$.
By computation,
$$N_2 \cdot S > a_1 a_2 (N_2 + b_1)$$
$$\therefore\; N_2/(N_2 + b_1) > a_1 a_2 / S.$$
Since
$$a_1 a_2 / S > (a_1 a_2 - b_1 b_2)/S,$$
we have
$$d_2 = d N_2/(N_2 + b_1) > d (a_1 a_2 - b_1 b_2)/S = d_4.$$


II) To prove: $d_3 > d_4$.
By computation,
$$N_1 \cdot S > a_1 a_2 (N_1 + b_2)$$
$$\therefore\; N_1/(N_1 + b_2) > a_1 a_2 / S.$$
Since
$$a_1 a_2 / S > (a_1 a_2 - b_1 b_2)/S,$$
we have
$$d_3 = d N_1/(N_1 + b_2) > d (a_1 a_2 - b_1 b_2)/S = d_4.$$


III) To prove: If $a_1 < a_2$ and $b_1 < b_2$ then $d_2 > d_3$.
Since $N_2 > N_1$ it follows that $N_1 N_2 + N_2 b_2 > N_1 N_2 + N_1 b_1$.
Dividing by $(N_2 + b_1)(N_1 + b_2)/d$ we have
$$d_2 = d N_2/(N_2 + b_1) > d N_1/(N_1 + b_2) = d_3.$$
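IV) To prove: $K_1 > K_2$.
$$b_1 N_1 N_2 + N_2^2 (N_1 - a_1) > 0$$
$$\therefore\; b_1 N_1 N_2 + N_1 N_2^2 - a_1 N_2^2 > 0$$
$$\therefore\; N_1 N_2 > a_1 N_2^2/(N_2 + b_1)$$
$$\therefore\; \sqrt{N_1 N_2} > N_2 \sqrt{a_1 (N_2 + b_1)}\,/(N_2 + b_1)$$
$$\therefore\; K_1 = M \sqrt{N_1 N_2} > M \bigl( N_2/(N_2 + b_1) \bigr) \sqrt{a_1 (N_2 + b_1)} = K_2.$$

V) To prove: $K_1 > K_3$.
$$b_2 N_1 N_2 + N_1^2 (N_2 - a_2) > 0$$
$$\therefore\; b_2 N_1 N_2 + N_1^2 N_2 - a_2 N_1^2 > 0$$
$$\therefore\; N_1 N_2 > a_2 N_1^2/(N_1 + b_2)$$
$$\therefore\; \sqrt{N_1 N_2} > N_1 \sqrt{a_2 (N_1 + b_2)}\,/(N_1 + b_2)$$
$$\therefore\; K_1 = M \sqrt{N_1 N_2} > M \bigl( N_1/(N_1 + b_2) \bigr) \sqrt{a_2 (N_1 + b_2)} = K_3.$$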


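VI) To prove: $K_2 > K_4$.
By computation,
$$a_1 N_2^2 S > a_1^2 a_2^2 (N_2 + b_1)$$
$$\therefore\; N_2 \sqrt{a_1} \big/ \sqrt{N_2 + b_1} > a_1 a_2 / \sqrt{S}.$$
Since
$$a_1 a_2 / \sqrt{S} > (a_1 a_2 - b_1 b_2)/\sqrt{S},$$
we have
$$K_2 = M \bigl( N_2/(N_2 + b_1) \bigr) \sqrt{a_1 (N_2 + b_1)} > M \bigl( (a_1 a_2 - b_1 b_2)/S \bigr) \sqrt{(a_1 + b_2)(a_2 + b_1)} = K_4.$$


VII) To prove: $K_3 > K_4$.
By computation,
$$a_2 N_1^2 S > a_1^2 a_2^2 (N_1 + b_2)$$
$$\therefore\; N_1 \sqrt{a_2} \big/ \sqrt{N_1 + b_2} > a_1 a_2 / \sqrt{S}.$$
Since
$$a_1 a_2 / \sqrt{S} > (a_1 a_2 - b_1 b_2)/\sqrt{S},$$
we have
$$K_3 = M \bigl( N_1/(N_1 + b_2) \bigr) \sqrt{a_2 (N_1 + b_2)} > M \bigl( (a_1 a_2 - b_1 b_2)/S \bigr) \sqrt{(a_1 + b_2)(a_2 + b_1)} = K_4.$$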


VIII) To prove:

a) if $a_1 < a_2$ and $b_1 < b_2$ then $K_2$ need not be greater than $K_3$;
b) if $a_1 < a_2$, $b_1 < b_2$ and $a_1 b_2 > a_2 b_1$ then $K_2 > K_3$.

Proof of
a) Let $a_1 = 100$, $a_2 = 300$, $b_1 = 9{,}900$, $b_2 = 10{,}000$. By computation,
$K_2 < K_3$.
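
The computation behind part a) can be reproduced in a few lines. Because $M$ is a positive factor common to $K_2$ and $K_3$, it is enough to compare the two quantities with $M$ divided out; the function names below are illustrative.

```python
from math import sqrt


def k2_over_m(a1, b1, a2, b2):
    """K_2 / M = (N_2 / (N_2 + b_1)) * sqrt(a_1 (N_2 + b_1)), with N_2 = a_2 + b_2."""
    n2 = a2 + b2
    return n2 / (n2 + b1) * sqrt(a1 * (n2 + b1))


def k3_over_m(a1, b1, a2, b2):
    """K_3 / M = (N_1 / (N_1 + b_2)) * sqrt(a_2 (N_1 + b_2)), with N_1 = a_1 + b_1."""
    n1 = a1 + b1
    return n1 / (n1 + b2) * sqrt(a2 * (n1 + b2))


counterexample = dict(a1=100, b1=9900, a2=300, b2=10000)
print(round(k2_over_m(**counterexample), 1), round(k3_over_m(**counterexample), 1))
```

In this counterexample $a_1 b_2 = 1{,}000{,}000$ is smaller than $a_2 b_1 = 2{,}970{,}000$, so the extra condition of part b) does not hold.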

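Proof of
b) Under the given conditions the following holds:

$$a_1^2 a_2^2 + a_1^2 b_2^2 + 2 a_1^2 a_2 b_2 + a_1 a_2^2 b_1 + a_1 b_1 b_2^2 + 2 a_1 a_2 b_1 b_2 + a_1 a_2^2 b_2 + a_1 b_2^3 + 2 a_1 a_2 b_2^2$$
$$> a_1^2 a_2^2 + a_2^2 b_1^2 + 2 a_1 a_2^2 b_1 + a_1^2 a_2 b_1 + a_2 b_1^3 + 2 a_1 a_2 b_1^2 + a_1^2 a_2 b_2 + a_2 b_1^2 b_2 + 2 a_1 a_2 b_1 b_2$$

$$\therefore\; (a_1 a_2^2 + a_1 b_2^2 + 2 a_1 a_2 b_2)(a_1 + b_1 + b_2) > (a_1^2 a_2 + a_2 b_1^2 + 2 a_1 a_2 b_1)(a_2 + b_1 + b_2).$$

Dividing by $(a_1 + b_1 + b_2)(a_2 + b_1 + b_2)$ and simplifying, it follows that

$$\bigl( N_2/(N_2 + b_1) \bigr)^2 (a_1)(N_2 + b_1) > \bigl( N_1/(N_1 + b_2) \bigr)^2 (N_1 + b_2)(a_2).$$

Taking square roots, and multiplying by $M$, we have $K_2 > K_3$.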





AN ANALYSIS OF CONSISTENCY OF RESPONSE 
IN HOUSEHOLD SURVEYS 


Carol M. Jaeger and Jean L. Pennock
United States Department of Agriculture


In obtaining data through household surveys, one of the primary con- 
cerns is the ability of respondents to recall certain facts. Data relating 
to a specific household appliance collected from identical households on 
identical questions in two national surveys made approximately one 
year apart form the basis for the present analysis of consistency of 
response. Although there is no method of checking the accuracy of the 
responses, the effects of differing responses are examined. 


1. INTRODUCTION


One of the areas of concern to all who use data based on responses made by
individuals is the accuracy of those responses, and the effect of inaccura-
cies on a resulting statistic or conclusion. Much work has been done in the
fields of questionnaire construction and interviewer training,¹ but regardless
of the skill with which the interviewing is done, the accuracy of the final result
rests with the respondent and his ability to recall the needed facts. In the
field of consumption data it is difficult to test the accuracy of reporting, par-
ticularly when events in the past such as the acquisition of goods are involved.²
In the absence of evidence corroborating the responses, information on con-
sistency of response may be available and may prove useful. While consistency
between sets of responses to the same questions cannot be assumed to indicate
that the responses are accurate, inconsistency does indicate that at least one
set is inaccurate. Ferber has reported on an investigation of the consistency
among responses of family members to questions put to them simultaneously.³
The present investigation reports on another facet of consistency, that be- 
tween reports obtained over time from the same household. Using data ob- 
tained in two surveys made primarily for another purpose, it has been possible 
to analyze the consistency of responses made approximately a year apart by 
the same households on identical questions relating to the characteristics of 
owned electric washing machines. Since it was impractical because of cost con- 
siderations to designate the respondent for the main surveys, the investigation 
of consistency of response was limited to a household basis. Had the choice of 
respondent been controlled, the degree of inconsistency reported below would 
undoubtedly have been reduced. 

The primary purpose of the surveys used here was to establish a set of data 
from which to compute service-life expectancies (under one owner) of house- 





1 For a partial list of articles dealing with these subjects see: “A Basic Bibliography on Marketing Research”
compiled by Hugh G. Wales and Robert Ferber, American Marketing Association Bibliography Series, No. 2, 
June 1956. 

2 An investigation of the accuracy of responses on purchases in the immediate past is reported by: Mets, 
Joseph F., Jr., “Accuracy of Response Obtained in a Milk Consumption Study,” Methods of Research in Marketing, 
Paper No. &, Department of Agricultural Economics, Cornell University Agricultural Experiment Station, July 
1956. 

3 Ferber, Robert, “On the Reliability of Responses Secured in Sample Surveys,” Journal of the American
Statistical Association, 50 (1955), 788-810. 
hold appliances and furnishings. Because previous analyses⁴ had indicated
that it would be necessary to have large samples so that the life expectancies
would have a reasonable reliability, arrangements were made with the Bureau
of the Census to collect the data as a supplement to the Current Population
Survey. This supplement covered one-half of the total CPS sample, or approxi-
mately 17,500 households, and was nationwide in scope.⁵

The plan of work called for a series of collections, each covering several items. 
The three items of the first collection were to be repeated, one at a time, in 
succeeding collections. Since the CPS uses a rotating sample, the spacing of 
the first two collections made it possible to have one-fourth of the sample 
households that were in the first collection (January) appear in the second 
(December). 


2. THE SCHEDULE, DEFINITIONS AND SAMPLE 


The questions asked were factual in nature and were in as direct and simple 
a form as possible. The pertinent section of the supplement is reproduced in 
Figure 1. 

Interviewers were trained in the usual manner by the Bureau of the Census, 
but there was, in addition to the regular instructions, a section covering the 
household equipment supplement. Included in the instructions were the fol- 
lowing definitions for electric washing machines: 


A manually-operated machine is defined as one with a wringer or with a spin- 
dryer. The operator is required to control the entire operation of the mechanism 
including the filling and emptying of the machine. 

A fully automatic machine is one which goes through its complete cycle of washing, 
rinsing, and damp-drying operations without further attention after the controls 
have once been set. No resetting of controls is necessary—and no special control is 
exerted on the filling or emptying of the machine. 

A semiautomatic washing machine requires an original setting of controls—and 
then at least one resetting of the controls, usually after its suds-washing operations 
in order to fill or empty the machine of water. Thus, it differs from the automatic 
machine largely because of the additional attention and resetting required by the 
semiautomatic type. And the basic difference between the manually-operated electric 
washing machine and the others is that it not only requires resetting of controls, but 
it also requires some manual handling in passing the clothes through a wringer or a 
spindryer. 


Instructions were given the interviewers that if the respondent was uncer- 
tain of his facts, for example, the date of purchase, they were to try to find out 
whether there was anyone else in the household who knew the information 
more precisely and when that person was likely to be home. Arrangements 
could be made to telephone that person for the information if the household 
had a phone; otherwise the interviewer was instructed to try to call back at the 
household at the time the person was expected to be at home. 

The one-fourth sample in each collection amounted to 4,049 households. 
But 1,094 households were unable to provide comparable information for both 





4 Pennock, Jean L. and Jaeger, Carol M., “Estimating the Service Life of Household Goods by Actuarial
Methods,” Journal of the American Statistical Association, 52 (1957), 175-85.

5 For details as to the sample design, see Bureau of the Census, Current Population Reports, Series P-23,
No. 5, May 9, 1958.
Fig. 1. Section of schedule used in the CPS supplement for electric washing machines.


surveys because of movement of families into or out of sample addresses or because the
occupants were not found at home after repeated calls or were unavailable for
other reasons. Of the 2,955 identical households, 622 reported that they did
not own a washer in either period, while 243 gave information on washers in only
one of the two surveys.⁶ The remaining 2,090 identical households gave in-
formation on washers in both surveys.

In this paper a report of ownership of a washing machine is called the inven- 
tory. An owned washing machine disposed of during the preceding year is 
called a discard. For the immediate purposes of comparison of responses, con- 
sistent reports could be only 1) those where the January inventory is the same 
as the December inventory and there was no discard during the year, or 2) 





6 Of the 243 reports, 49 were of such a nature that they could be termed “consistent” while 194 were “incon-
sistent.”
those where the January inventory is the same as the December discard. A
discard also could have been reported in the January collection and a purchase 
made between the January and December collection but this situation gives no 
basis for the analysis of consistency in reporting because of the reference to 
two machines rather than to one machine. The report from one household fell 
into this category, thus reducing the number for analysis to 2,089. 

In examining the two sets of responses, the reports were classified first by 
kind of washer, second by condition when purchased (new or used), and then 
by year of purchase. Of the reports from the 2,089 households only 598 were 
consistent with respect to all three categories, while 1,491 were inconsistent in 
one or more categories. Because of the impossibility of classifying some of the 
inconsistent reports, the total number of tabulated reports is 2,079. 


3. CLASSIFICATION OF INCONSISTENCIES 


The different types of inconsistencies shown in Table 1 summarize the rela- 
tionship between the report made in January with that made in December for 
each of the three categories of kind, condition, and year of acquisition. 


TABLE 1. SUMMARY OF PAIRED REPORTS FROM TWO SURVEYS ON
OWNED WASHING MACHINES INDICATING CONSISTENCY OF
REPORTING KIND OF MACHINE, CONDITION AT TIME OF
ACQUISITION, AND YEAR ACQUIRED


                   Categories
Kind          Condition      Year           Number

Same          Same           Same              598
Same          Same           Different       1,068
Same          Different      Same               28
Different     Same           Same               44
Same          Different      Different         131
Different     Same           Different         170
Different     Different      Same                5
Different     Different      Different          35

All                                          2,079





For “condition” and “year” the total number of comparisons was less than 
the 2,079 tabulated reports because of a report of “not known” in one or the 
other of the two surveys. 

In order to compare the number of consistent reports with the number of 
inconsistent reports for each of the categories, the following summary is given: 








               Consistent    Inconsistent    Total
                  No.            No.          No.

Kind             1,825            254        2,079
Condition        1,880            150        2,030
Year               675          1,348        2,023
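
The per-category counts of consistent reports can be recovered from the eight patterns of Table 1. The sketch below does this tabulation; the data are the Table 1 counts, while the names are illustrative. It also suggests, as an inference rather than something the paper states, that the “Different” patterns for condition and year in Table 1 include the reports of “not known,” since the excess over the summary's inconsistent counts equals the shortfall in the number of comparisons.

```python
# (kind, condition, year) pattern -> number of paired reports, from Table 1
TABLE_1 = {
    ("Same", "Same", "Same"): 598,
    ("Same", "Same", "Different"): 1068,
    ("Same", "Different", "Same"): 28,
    ("Different", "Same", "Same"): 44,
    ("Same", "Different", "Different"): 131,
    ("Different", "Same", "Different"): 170,
    ("Different", "Different", "Same"): 5,
    ("Different", "Different", "Different"): 35,
}
CATEGORIES = ("Kind", "Condition", "Year")


def marginal_counts(table):
    """Number of reports marked Same and Different for each category."""
    result = {}
    for index, name in enumerate(CATEGORIES):
        same = sum(n for pattern, n in table.items() if pattern[index] == "Same")
        result[name] = (same, sum(table.values()) - same)
    return result


margins = marginal_counts(TABLE_1)
total_reports = sum(TABLE_1.values())
for name, (same, different) in margins.items():
    print(name, same, different)
```

The counts marked “Same” reproduce the consistent column of the summary for all three categories.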








324 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1961 


4. ANALYSIS OF REPORT OF YEAR OF ACQUISITION 


As would be expected, the greatest proportion of inconsistencies are in the 
report on the year of acquisition. A distribution by the number of years of dif- 
ference between the two reports, regardless of period (Table 2), shows about 33 
per cent reported the same year in the two surveys; another 31 per cent dif- 
fered by 1 year; 12 per cent differed by 2 years; 7 per cent by 3 years; 3 per 
cent by 4 years; and 4 per cent by 5 years. The remaining 10 per cent varied 
over a wide range of years with no large percentage in any particular year. 
When the reports are considered with respect to the direction of change, the 
only appreciable difference is in the group having one year’s difference. 


TABLE 2. DISTRIBUTION OF REPORTS BY NUMBER OF YEARS OF 
DIFFERENCE IN THE REPORTING OF DATE OF ACQUISITION 
IN JANUARY AND DECEMBER SURVEYS AND 
DIRECTION OF CHANGE 








Number of years    Distribution of    Earlier year of purchase    Later year of purchase
of difference      all reports        reported in December        reported in December








Number 
675 
617 


Number 


Per cent 


50 
19 


Number 


251 


Per cent 


40 


116 19 
12 





15 or more 2 





Total | 2,023 























* Less than 0.5 of 1 per cent. 


The relationship between recency of purchase and the amount and direc- 
tion of change in the date of purchase was also investigated. Examination of 
Table 3 shows a fairly consistent pattern of change in the report. If the machine 
was reported to be five years old or less at the time of the first interview, a 
change in the report of year of acquisition at the time of the second interview 
was likely to make it older rather than younger (in other words, to put the 
year of acquisition earlier rather than later) but the reverse was true for 
machines originally reported to have been more than five years old. However, 
there seems to be a fair amount of balancing out in the differences in year of 
acquisition. 

This impression is reinforced by the fact that the correlation coefficient be-
tween dates of acquisition in the two sets of reports is 0.81. Also, the average
age computed from the December reports is the expected one year greater 
than that from the January reports. 

It appears then, that although there was considerable difference in the report 
of year of acquisition, the magnitude of the differences in report was small in 
the majority of the cases. There is evidence that in a sample of the size reported 
on here, the balancing out of the differences helps reduce the net effect of the 
inconsistencies. 


5. ANALYSIS OF REPORT OF KIND AND CONDITION 


The reporting of the categories of kind and condition is summarized below.
Consistent reports are underlined; other entries in the body of the table are the
inconsistent reports.


KIND 








                                 January report of:
                         Automatic    Semi-         Wringer and    December
                                      automatic     spin dryer     totals





DECEMBER report of: 
Automatic 743 : 44 827 


Semiautomatic 20 40 106 
Wringer and spin dryer F 1,146 











JANUARY totals § 2,079 




















CONDITION

                              January report of:
                              New          Used         December totals

DECEMBER report of:
  New                        1,641           69            1,710
  Used                          81          239              320

JANUARY totals               1,722          308            2,030








Of the group as a whole 12 per cent changed the report of kind of machine. 
Only 7 per cent (56) of those reporting an automatic washer and 8 per cent 
(89) of those reporting a wringer or spin dryer machine in January reported 
them as a different kind in December. The proportion changing from a semi- 
automatic washer amounted to 70 per cent (109). The high percentage may 
be accounted for in part by the failure of many respondents to understand the 
distinction between the semiautomatic and the other washers. In other words, 
the problem may be one of definition and not one of ability to recall. 

Inconsistency in the reporting of condition of the washer was even less im-
portant than inconsistency in the reporting of kind. Only 5 per cent of those
who reported a new machine in January changed this to a used machine in the 
December report, while 22 per cent changed from used to new. Because of the 
small number in the used class in both surveys, the total change affected only 
7 per cent of the reports. 
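
The percentages quoted for condition can be rechecked from the cell counts of the CONDITION table. The following is a minimal sketch in Python; the variable and function names are illustrative and not part of the original analysis.

```python
# Cell counts read from the CONDITION table (January report by December report).
new_to_new, used_to_new = 1641, 69    # December report: New
new_to_used, used_to_used = 81, 239   # December report: Used

january_new = new_to_new + new_to_used      # 1,722 reported New in January
january_used = used_to_new + used_to_used   # 308 reported Used in January
total = january_new + january_used          # 2,030 comparable reports


def per_cent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


changed_new = per_cent(new_to_used, january_new)           # new -> used
changed_used = per_cent(used_to_new, january_used)         # used -> new
changed_all = per_cent(new_to_used + used_to_new, total)   # any change

print(changed_new, changed_used, changed_all)  # 5 22 7
```

The three figures agree with the 5, 22 and 7 per cent stated in the text.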


CONCLUSION 


From the analysis presented here, the inconsistency in reporting, although
of considerable magnitude for some individual households, was of very little 
consequence for the sample as a whole. In the reporting of year of acquisition 
75 per cent of the responses varied 2 years or less. While there appears to be 
some pattern of change in the reporting of date of acquisition, there also tends 
to be some balancing out in the direction of these changes. Inconsistencies in 
the reporting of other descriptive data were relatively few except that it ap- 
peared that some respondents did not understand the term “semiautomatic 
washer.” The net effect of all inconsistencies in these data is not appreciable. 








RANDOMIZED ROUNDED-OFF MULTIPLIERS IN 
SAMPLING THEORY 


M. N. Murthy
Indian Statistical Institute, Calcutta
AND
V. K. Sethi
Institute of Social Sciences, Agra


In a large scale survey, where a non-self-weighting design is used, 
the work at the tabulation stage becomes time consuming and costly 
due to the large number of multipliers (inflation factors) involved in 
obtaining the estimates. In this paper a technique is developed to sub- 
stitute the multipliers by a very small number of multipliers called 
“randomized rounded-off multipliers” by a suitable randomizing process, 
thereby reducing the work at the tabulation stage. A suitable procedure 
has been suggested for determining the values of the randomized 
rounded-off multipliers which would minimize the increase in the vari- 
ance of the estimator. 


1. INTRODUCTION 


IN LARGE scale sample surveys, where a number of characteristics of the
population are to be estimated and a large number of cross tables are to be
prepared, the tabulation time and cost may be reduced substantially by reduc-
ing the number of multipliers to three or four or even fewer. This is especially
true if the organization in charge of the tabulation does not have the modern
data-processing machines. The best thing would be to have a sampling design 
which leads to a single or a very small number of multipliers. Sometimes how- 
ever the design does not satisfy this condition because of various reasons. There 
may be a number of different ways of solving this problem at the tabulation 
stage. Some of these have been discussed by the authors in a separate paper 
[1]. The present paper deals with only one of these methods in detail. 

Here the multipliers are rounded off to a very small number of multipliers 
by a random process which would retain the character of unbiasedness of the 
estimators. A procedure is given to find the optimum values of the rounded-off 
multipliers which would minimize the increase in the variance of the estimator 
introduced by the random process. The problem of minimizing this increase in 
variance by a proper splitting-up of large multipliers is also considered. Further, 
a practical procedure is suggested to determine the number of rounded-off 
multipliers to be used to get a desired precision. 


2. FORMULATION OF THE PROBLEMS 


Suppose $y_i$ and $a_i$ ($i = 1, 2, \ldots, n$) denote the value of the characteristic for
the $i$th sample unit and its corresponding multiplier such that

$$\sum_{i=1}^{n} a_i y_i$$

is an unbiased estimator of the population total $Y$. The problem is to find a
set of $k$ rounded-off multipliers $b_1 < b_2 < \cdots < b_k$ ($k$ being considerably smaller
than $n$) such that

$$E\left(\sum_{i=1}^{n} r_i y_i\right) = \sum_{i=1}^{n} a_i y_i \tag{1}$$

and

$$V\left(\sum_{i=1}^{n} r_i y_i\right)$$

is minimum, where $r_i$ is a random variable taking the values $(b_1, b_2, \ldots, b_k)$
with certain probabilities and $E$ and $V$ denote the conditional expectation and
conditional variance respectively given the sample of $n$ units. Further a bal-
ance is to be struck between the increase in variance and amount of labour in-
volved in obtaining and using these rounded-off multipliers.

If there are a few very large multipliers it may be advantageous to split each 
of them into two or more multipliers before randomization at the tabulation 
stage. The problem here is to determine the number and the best set of split 
multipliers which should replace a large multiplier and to investigate the con- 
ditions, if any, under which such splitting would result in a reduction in the 
increase in variance. 


3. OPTIMUM SET OF ROUNDED-OFF MULTIPLIERS 


A necessary and sufficient condition for the equation (1) to hold for all
values of $y$ is that $E(r_i) = a_i$ for all $i$. Suppose the set of $k$ rounded-off multi-
pliers $b_j$ ($j = 1, 2, \ldots, k$) is given and that the range of this set includes the
range of the original multipliers $a_i$ ($i = 1, 2, \ldots, n$). $E(r_i)$ will be equal to $a_i$
and $V(r_i)$ will be minimum if $r_i$ takes the values $b_j$ and $b_{j+1}$ nearest to $a_i$ on
both sides of it ($b_j \le a_i \le b_{j+1}$) with probabilities

$$\frac{b_{j+1} - a_i}{b_{j+1} - b_j} \quad \text{and} \quad \frac{a_i - b_j}{b_{j+1} - b_j}$$

respectively. This important result can be proved easily.
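
The allocation rule can be sketched in a few lines of Python. This is an illustration only, under the assumption that the rounded-off set is supplied as a sorted list covering the range of the original multipliers; the function names `rounding_probabilities`, `randomized_round` and `rounding_variance` are invented for the example.

```python
import bisect
import random


def rounding_probabilities(a, b):
    """Return (b_lo, b_hi, p_lo) for a multiplier a and a sorted list b.

    b_lo and b_hi are the rounded-off multipliers nearest to a on either
    side, and p_lo is the probability of choosing b_lo, so that the
    expected value of the rounded multiplier equals a.
    """
    if not b[0] <= a <= b[-1]:
        raise ValueError("a must lie within the range of b")
    j = bisect.bisect_left(b, a)      # first index with b[j] >= a
    if b[j] == a:                     # a is itself a rounded-off multiplier
        return a, a, 1.0
    b_lo, b_hi = b[j - 1], b[j]
    return b_lo, b_hi, (b_hi - a) / (b_hi - b_lo)


def randomized_round(a, b, rng=random):
    """Draw one rounded-off multiplier for a."""
    b_lo, b_hi, p_lo = rounding_probabilities(a, b)
    return b_lo if rng.random() < p_lo else b_hi


def rounding_variance(a, b):
    """Variance of the two-point rounded multiplier: (b_hi - a)(a - b_lo)."""
    b_lo, b_hi, _ = rounding_probabilities(a, b)
    return (b_hi - a) * (a - b_lo)


b_lo, b_hi, p_lo = rounding_probabilities(2, [1, 4, 6])
expected = p_lo * b_lo + (1 - p_lo) * b_hi   # equals the original multiplier 2
```

For the multiplier 2 and the set 1, 4, 6 the sketch gives the value 1 with probability 2/3 and the value 4 with probability 1/3, so the expectation is 2 and the variance is (4 - 2)(2 - 1) = 2.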

Having decided the procedure of allocation of the rounded-off multipliers to
the original multipliers, let us find the values of the rounded-off multipliers
which would minimize the increase in variance of the estimator. The variance
of the estimator

$$\sum_{i=1}^{n} r_i y_i$$

over the tabulation stage is given by

$$V\left(\sum_{i=1}^{n} r_i y_i\right) = \sum_{b_1 \le a_i \le b_2} (b_2 - a_i)(a_i - b_1) y_i^2 + \sum_{j=2}^{k-1} \sum_{b_j < a_i \le b_{j+1}} (b_{j+1} - a_i)(a_i - b_j) y_i^2$$

when the $r_i$'s are chosen independently for each $i$.

The only term involving $b_1$ in the above expression is a decreasing function
of $b_1$. Hence the optimum value of $b_1$ is the minimum of $a_i$'s. Similarly, consid-
ering the term containing $b_k$, it is easy to see that that would be minimized for
$b_k$ = maximum of the $a_i$'s. Thus the optimum values of these two rounded-off
multipliers, $b_1$ and $b_k$, are easily found and are independent of the set of values
of the characteristic $y$ in the sample.

The decision about the intermediate values is more difficult. It may be
shown that for fixed $b_{j-1}$ and $b_{j+1}$, the optimum $b_j$ satisfies

$$\sum_{b_j < a_i \le b_{j+1}} y_i^2 \;\le\; \frac{\sum_{b_{j-1} < a_i \le b_{j+1}} (a_i - b_{j-1}) y_i^2}{b_{j+1} - b_{j-1}} \;\le\; \sum_{b_j \le a_i \le b_{j+1}} y_i^2. \tag{3}$$

If the equality sign on the left side of the expression (3) is satisfied, any value
in the interval $a_k \le b_j \le a_{k+1}$ may be taken as the optimum value of $b_j$. For ob-
taining $b_j$, all the units having the original multipliers between $b_{j-1}$ and $b_{j+1}$
may be arranged in decreasing order of the multipliers. Then the cumulative
sums of $y_i^2$ are to be determined from the top and the multiplier of the unit
where the cumulated sum is equal to or just greater than the value of the
middle expression in (3) is to be taken as the value of $b_j$.

The actual solution for the set of $b_j$'s is difficult to obtain if $k$ is greater
than 3. In such cases it would be advisable to adopt an iterative process to get
a near optimum solution. The iterative process consists in starting with an
arbitrary set of $b_j$'s and revising each of them by fixing the $b_j$'s on either side
of it. For instance, in case of $k = 4$, $b_1$ and $b_4$ are taken as the least and the
largest multipliers, respectively, and some arbitrary values are attributed to
$b_2$ and $b_3$. Then fixing $b_1$ and $b_3$, a new value $b_2'$ is determined in an optimum
manner. A value $b_3'$ is then determined between $b_2'$ and $b_4$. This process is con-
tinued till two successive values of the second and the third rounded-off multi-
pliers agree. This iterative method, though fairly simple for $k = 4$, becomes
quite cumbersome for larger values of $k$. But use of modern tabulating machines
would considerably reduce the difficulties in using this iterative process.
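
The iterative process for $k = 4$ may be sketched as follows in Python. The data are hypothetical and invented for the example, the starting values of the two middle multipliers are arbitrary, and the function names are assumptions; the cumulative-sum rule is the one described after expression (3).

```python
def tabulation_variance(a, y, b):
    """Variance of sum(r_i * y_i) over the tabulation stage for a sorted set b."""
    total = 0
    for a_i, y_i in zip(a, y):
        lo = max(v for v in b if v <= a_i)   # nearest rounded-off value below
        hi = min(v for v in b if v >= a_i)   # nearest rounded-off value above
        total += (hi - a_i) * (a_i - lo) * y_i ** 2
    return total


def best_between(a, y, lo, hi):
    """Optimum multiplier between fixed neighbours lo and hi, using (3)."""
    units = [(a_i, y_i) for a_i, y_i in zip(a, y) if lo < a_i <= hi]
    if not units:
        return (lo + hi) / 2          # arbitrary choice when no unit is affected
    middle = sum((a_i - lo) * y_i ** 2 for a_i, y_i in units) / (hi - lo)
    cumulated = 0
    for a_i, y_i in sorted(units, reverse=True):   # decreasing multipliers
        cumulated += y_i ** 2
        if cumulated >= middle:
            return a_i
    return min(a_i for a_i, _ in units)   # guard against rounding error


# Hypothetical multipliers and values, invented for the example.
a = [10, 12, 15, 20, 22, 30, 35, 40]
y = [3, 1, 2, 2, 1, 1, 2, 1]

b = [min(a), 15, 30, max(a)]          # arbitrary starting values for b_2, b_3
start_variance = tabulation_variance(a, y, b)

for _ in range(100):                  # cap on the number of passes
    new_b2 = best_between(a, y, b[0], b[2])
    new_b3 = best_between(a, y, new_b2, b[3])
    if (new_b2, new_b3) == (b[1], b[2]):
        break                         # two successive passes agree
    b[1], b[2] = new_b2, new_b3

final_variance = tabulation_variance(a, y, b)
print(b, start_variance, final_variance)  # [10, 20, 35, 40] 362 192
```

With these invented data the process settles after one revision, and the variance over the tabulation stage falls from 362 to 192.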


4. ILLUSTRATION


A trivial example is considered to illustrate the above procedure. Suppose
the multipliers are 1, 2, 4, 4, 6 and $y_i = 1$ for all $i$. The object is to find three
rounded-off multipliers such that the variance of the estimator of the total of
these obtained by replacing them by rounded-off multipliers is the minimum.
The first and the third multipliers will be 1 and 6, respectively. In this case, the
middle expression in (3) is 2.4. After arranging the multipliers in decreasing
order, it is found that the first two units having multipliers 6 and 4 make the
right side of (3) equal to 2 and the first three units having multipliers 6, 4, and 4
make it equal to 3. So 4 can be taken as the second rounded-off multiplier ($b_2$).
It can be seen that $b_2 = 4$ gives the minimum variance:


value of $b_2$      2    3    4    5

variance            8    5    2    9
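
The illustration can be retraced numerically. The sketch below, in Python, applies the cumulative-sum rule to the five multipliers and evaluates the variance over the tabulation stage for a few candidate values of the middle multiplier; the function and variable names are assumptions made for the example.

```python
def tabulation_variance(a, y, b):
    """Variance of sum(r_i * y_i) over the tabulation stage for a sorted set b."""
    total = 0
    for a_i, y_i in zip(a, y):
        lo = max(v for v in b if v <= a_i)   # nearest rounded-off value below
        hi = min(v for v in b if v >= a_i)   # nearest rounded-off value above
        total += (hi - a_i) * (a_i - lo) * y_i ** 2
    return total


a = [1, 2, 4, 4, 6]
y = [1, 1, 1, 1, 1]
b1, b3 = min(a), max(a)               # extreme rounded-off multipliers

# Middle expression of (3) for fixed b1 and b3.
middle = sum((a_i - b1) * y_i ** 2 for a_i, y_i in zip(a, y)) / (b3 - b1)

# Cumulate y_i^2 from the largest multiplier downwards.
cumulated = 0
for a_i, y_i in sorted(zip(a, y), reverse=True):
    cumulated += y_i ** 2
    if cumulated >= middle:
        b2 = a_i
        break

# Variance for several candidate middle multipliers.
variances = {c: tabulation_variance(a, y, [b1, c, b3]) for c in (2, 3, 4, 5)}
best = min(variances, key=variances.get)
print(b2, best, variances[best])  # 4 4 2
```

The rule and the direct comparison both select 4 as the middle multiplier, with a variance of 2.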


5. ESTIMATE OF INCREASE IN VARIANCE


The variance of

$$\sum_{i=1}^{n} r_i y_i$$


over the tabulation stage is an unbiased estimator of the increase in variance of 
the estimator. Hence this itself can be taken as the criterion for deciding the 
number and values of the rounded-off multipliers to be used to replace the 
original multipliers. 


6. NUMBER OF ROUNDED-OFF MULTIPLIERS 


An increase in k, the number of rounded-off multipliers, would result in a 
decrease in the increase in variance and an increase in the tabulation time and 
cost. So one should try to minimize k so as to attain a given precision. The fol- 
lowing practical method would help in finding the minimum number of rounded- 
off multipliers required to achieve the desired precision. One may start with 
k=2. If the estimate of increase in variance as given in section 5 is greater than 
the value decided upon, only then the third rounded-off multiplier between 
the smallest and the largest multipliers should be chosen by the iterative 
method. If the estimate of the increase in variance is still greater than the pre- 
determined value, one more rounded-off multiplier is to be determined by the 
iterative process starting from that part which contributes more to the increase 
in variance. This process is continued till the desired precision is achieved. 

In this connection it may be worthwhile to round off the original multipliers 
to two significant figures. Tukey [2] has shown that rounding off the weights to 
two significant figures adds very little to the variance. We may even use the 
smaller set of numbers suggested by him, namely, 10(1)20(2)50(5)90. Using 
this smaller set of multipliers may reduce the burden of finding the optimum 
set of rounded-off multipliers and deciding the number of such multipliers would 
be comparatively simpler. To retain unbiasedness, one may round off the origi- 
nal multipliers to this optimum set of rounded-off multipliers by the random 
process. 


7. LARGE MULTIPLIERS 


In this section a solution for a particular case (where all $y_i$'s are equal) of
the second problem stated in section 2 is given. Let there be $n$ multipliers of
which $a_1$ is the smallest, $a_n$ the largest and $a_{n-1}$ the next largest. Suppose it is
required to have only two rounded-off multipliers. $a_1$ and $a_n$ will be these
rounded-off multipliers. Let us replace the largest multiplier by $x$ and $a_n - x$,
both lying between $a_1$ and $a_{n-1}$, which is possible if $a_n$ lies between $2a_1$ and
$2a_{n-1}$.

When $a_n$ is replaced by $x$ and $a_n - x$, the rounded-off multipliers are $a_1$ and
$a_{n-1}$. The contribution to the variance from the multipliers $x$ and $a_n - x$ is
minimized for $x = a_1$ if $a_n < a_1 + a_{n-1}$ and for $x = a_{n-1}$ if $a_n > a_1 + a_{n-1}$.

Theorem: When there are only two rounded-off multipliers, there will be a
reduction in variance if the largest multiplier is replaced by two multipliers
according to the rule given above if

(i) $a_n \ge a_1 + a_{n-1}$

or

(ii) $a_n < a_1 + a_{n-1}$, but $\sum_{i=1}^{n} a_i > a_1\left[(n + 1) + \dfrac{a_n - 2a_1}{a_n - a_{n-1}}\right]$.

For the second case, that is, when $a_n < a_1 + a_{n-1}$, the given condition is both
necessary and sufficient.


8. REPEATED MULTIPLIERS 

Suppose the multiplier $a_i$ is repeated $m$ times and the corresponding values
are $y_{ij}$, $j = 1, 2, \ldots, m$. The question arises as to whether we should take the
same value of $r_i$ to replace all the repeated multipliers or we should choose the
value of $r_i$ separately for each of them. In the former case the contribution to
the increase in variance from the multiplier $a_i$ is

$$\left(\sum_{j=1}^{m} y_{ij}\right)^2 V(r_i)$$

and in the latter it is

$$\sum_{j=1}^{m} y_{ij}^2 V(r_{ij}),$$

where $V(r_{ij}) = V(r_i)$ for all values of $j$. If all $y_{ij}$'s are positive, as they generally
are, it will be better to round off the repeated multiplier separately for each of
the units.

The following modification may give still better results. Suppose an $a_i$ lying
between $b_j$ and $b_{j+1}$ is repeated $m$ times. The corresponding $r_i$'s take the values
$b_j$ and $b_{j+1}$ with probabilities $p_j$ and $1 - p_j$, respectively. Instead of rounding off
the multipliers independently, we may put the restriction that the number of
$r_i$'s taking the value $b_j$ will be restricted to $[mp_j]$ and $[mp_j + 1]$ with probabil-
ities $[mp_j + 1] - mp_j$ and $mp_j - [mp_j]$, respectively. This method is the best for
the case where all $y_{ij}$'s are equal.
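
A minimal Python sketch of this restricted rounding is given below. It assumes that the units sharing the multiplier are exchangeable, so that which of them receive the lower value is decided by a random shuffle; the function name `restricted_round` and the use of a shuffle are assumptions of the example, not part of the original text.

```python
import math
import random


def restricted_round(m, b_lo, b_hi, p_lo, rng=random):
    """Assign rounded-off multipliers to m units sharing one multiplier.

    The number of units receiving b_lo is [m*p_lo] or [m*p_lo] + 1,
    chosen so that its expected value is m*p_lo.
    """
    target = m * p_lo
    base = math.floor(target)
    # Take base + 1 with probability equal to the fractional part of m*p_lo.
    n_lo = base + (1 if rng.random() < target - base else 0)
    labels = [b_lo] * n_lo + [b_hi] * (m - n_lo)
    rng.shuffle(labels)               # which units get b_lo is left to chance
    return labels


rng = random.Random(0)
sample = restricted_round(10, 1, 4, 0.25, rng)   # m*p_lo = 2.5
```

Here ten units share a multiplier for which the lower value 1 has probability 0.25, so either two or three of them receive the value 1 and the rest receive 4.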


9. AN EXAMPLE 


To try out the technique developed in section 3, the data on consumer ex-
penditure on cereals collected in a large scale survey were used. Here $a_i$ and
$y_i$ stand for the actual multiplier and expenditure on cereals of the $i$th sample
unit. The $a_i$'s varied from 20961 to 78993. Hence these two numbers were
taken as the two extreme rounded-off multipliers $b_1$ and $b_3$, respectively. For
finding a suitable value for $b_2$, a sub-sample of size 30 was taken from the sample
with equal probability. Using the technique developed in section 3, the value
of $b_2$ was found to be 57600. The increase in the variance of the estimator of
the total expenditure on cereals was calculated and was found to be 6.67%
of the estimate of the variance of

$$\sum_{i=1}^{n} a_i y_i.$$

This meant only 3.28% increase in the coefficient of variation of the estimator.
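
The step from the variance to the coefficient of variation is not spelled out. One plausible reading, assumed in the Python sketch below, is that the rounded estimator keeps the same expectation, so the coefficient of variation grows by the square root of the variance ratio.

```python
import math

variance_increase = 0.0667            # 6.67% increase in variance (from the text)

# With the expectation unchanged, the coefficient of variation scales as the
# square root of the variance.
cv_increase = math.sqrt(1 + variance_increase) - 1

print(round(100 * cv_increase, 2))    # 3.28
```

Under that assumption the quoted 3.28% is reproduced.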

To test the utility of the additional multiplier $b_2$, the increase in variance was
calculated on the basis of two multipliers $b_1$ and $b_3$. This turned out to be
about 170 times the increase in variance for three rounded-off multipliers. This
demonstrates that considerable reduction in the increase in variance has been
achieved by the additional multiplier $b_2$.


10. SOME ALTERNATIVE PROCEDURES 


Sometimes it may be desirable to use a ratio estimator of the form

$$\frac{\sum_{i=1}^{n} r_i y_i}{\sum_{i=1}^{n} r_i} \sum_{i=1}^{n} a_i$$

instead of the unbiased estimator,

$$\sum_{i=1}^{n} r_i y_i.$$

The ratio estimator is likely to be more efficient than the unbiased estimator
if the variation in $y_i$'s is much less compared to that in $a_i$'s. The efficiencies of
some alternative procedures of making the design self-weighting at the tabula- 
tion stage have been studied empirically by the authors. The procedures con- 
sidered are: 


(i) rounding off the multipliers to the nearest hundred, thousand, or ten 
thousand as the case may be; 
(ii) substituting the mean of the multipliers for all the multipliers; 
(iii) sub-sampling with probability proportional to multipliers with replace- 
ment; 
(iv) sub-sampling with probability proportional to multipliers systemati- 
cally. 


All these procedures are operationally simpler than the procedure given in 
section 3. The procedures (i) and (ii) give biased estimators with possibly a de- 
crease in variance under certain circumstances. In all such cases where the 
value of the character has small correlation with the multipliers, this bias is 
likely to be very small. Actually the variance of the estimator in the second 
alternative is very often smaller than that of the original unbiased estimator. 
Besides this is the simplest way of solving the problem of multiple multipliers 
and it reduces the tabulation cost to a minimum, employing as it does only a 
single multiplier. The methods (iii) and (iv) give unbiased estimators and lead
to a set of multipliers of the form $r, 2r, 3r, \ldots$ etc. With a suitable arrange-
ment of the units, the procedure (iv) is likely to give rise to a negligible increase
in variance. 



BIVARIATE LOGISTIC DISTRIBUTIONS


E. J. GUMBEL
Columbia University*


The logistic distribution closely resembles the normal one. Both are
symmetrical. Here two logistic bivariate distributions are studied. In
both cases the curves of equal probability density are not ellipses, the
regression curves are not linear and the conditional expectations are
limited. The first distribution analyzed with the help of the bivariate
moment generating function is asymmetrical and therefore departs
considerably from the normal one. The coefficient of correlation is con-
stant and equal to one half. The second bivariate logistic distribu-
tion is symmetrical. The regression curves are linear in probability
scale and the coefficient of correlation varies in the interval ±.30396.


INTRODUCTION

THE study of bivariate distributions was long confined to the normal case.
Where the regression curves turned out not to be linear, quadratic and
higher terms were introduced to account for the departure from the classical
theory. Instead it seems appropriate to study bivariate distributions where the
marginal distributions, namely the logistic, are similar to the normal one and
to compare their properties to those of the classical bivariate normal one.

1. GENERAL PROPERTIES OF BIVARIATE DISTRIBUTIONS

A bivariate probability function $F(x, y)$ with given margins $F_1(x)$ and $F_2(y)$
is such that

$$F(x, \infty) = F_1(x); \quad F(\infty, y) = F_2(y). \tag{1.1}$$

It satisfies the boundary conditions

$$F(-\infty, y) = F(x, -\infty) = F(-\infty, -\infty) = 0; \quad F(\infty, \infty) = 1. \tag{1.2}$$

The content of the rectangle $x_1, x_2, y_1, y_2$ must be positive, i.e.

$$F(x_2, y_2) + F(x_1, y_1) \ge F(x_2, y_1) + F(x_1, y_2). \tag{1.3}$$

If the second cross partial derivative $\partial^2 F(x, y)/\partial x\,\partial y$ exists everywhere, the
bivariate distribution is said to have a density $f(x, y)$, defined by

$$f(x, y) = \frac{\partial^2 F(x, y)}{\partial x\,\partial y}. \tag{1.4}$$

In this case (1.3) holds if and only if $f(x, y) \ge 0$. The conditions (1.1), (1.2)
and (1.3) are necessary and sufficient conditions that $F(x, y)$ be a bivariate
probability function with the margins $F_1(x)$ and $F_2(y)$.





* Work done in part under a grant from the National Science Foundation. Thanks are due to Mr. Simeon 
Berman for his help. 
The two variables are independent if and only if

$$F(x, y) = F_1(x) F_2(y). \tag{1.5}$$

A bivariate distribution having a density $f(x, y)$ is called symmetrical about
(0, 0) if

$$f(x, y) = f(-x, -y) \quad \text{and} \quad f(-x, y) = f(x, -y). \tag{1.6}$$

A necessary but not sufficient condition is that the marginal distributions are
symmetrical about zero. To prove that a bivariate distribution is asymmetrical,
it is sufficient to show that the first equation in (1.6) does not hold.

The joint probability $P(x, y)$ that the random variable $X$ exceed $x$ and the
random variable $Y$ exceed $y$ is

$$P(x, y) = 1 - F_1(x) - F_2(y) + F(x, y). \tag{1.7}$$

If the marginal distributions are symmetrical about zero, it follows that

$$P(0, 0) = F(0, 0). \tag{1.8}$$

The conditional density functions are defined as usual by

$$f(x \mid y) = f(x, y)/f_2(y); \quad f(y \mid x) = f(x, y)/f_1(x). \tag{1.9}$$

The regression analysis is facilitated by the introduction of the conditional
moment generating function $G(t_1 \mid y)$ defined as

$$G(t_1 \mid y) = \int_{-\infty}^{+\infty} e^{t_1 x} f(x \mid y)\,dx \tag{1.10}$$

and of the bivariate moment generating function $G(t_1, t_2)$ defined as

$$G(t_1, t_2) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} f(x, y)\, e^{t_1 x + t_2 y}\,dx\,dy. \tag{1.11}$$

If the conditional moment generating function (1.10) has been evaluated the
bivariate function is obtained as

$$G(t_1, t_2) = \int_{-\infty}^{+\infty} f_2(y)\, e^{t_2 y}\, G(t_1 \mid y)\,dy. \tag{1.12}$$

The two moment generating functions may be used to compute the conditional
and unconditional moments, the coefficient of correlation and the correlation
ratios $\eta(x \mid y)$ and $\eta(y \mid x)$ defined by

$$\eta^2(x \mid y) = \sigma^{-2}(y) \int (E(x \mid y) - E(x))^2 f_2(y)\,dy = \sigma^{-2}(y)\, E[E(x \mid y) - E(x)]^2 \tag{1.13}$$

and similarly for $\eta(y \mid x)$.

Even if the conditions (1.1), (1.3) and (1.4) are met, a bivariate probability
function is not uniquely determined by its margins. On the contrary, Fréchet
[2] has shown that there is an infinite number of solutions to this problem.

In the following, three bivariate distributions will be shown where the
marginal distributions are logistic functions. To simplify the derivation the
formulae will be written in reduced form.

In the first case the correlation is constant while this restriction does not 
hold in the two other cases considered briefly at the end of this article. 

2. THE FIRST BIVARIATE LOGISTIC DISTRIBUTION 


The univariate logistic distribution in its reduced form is defined by the ex-
pressions

$$F(x) = (1 + e^{-x})^{-1}; \quad f(x) = e^{-x}(1 + e^{-x})^{-2}; \quad -\infty < x < \infty. \tag{2.1}$$

This distribution is symmetrical about zero and possesses the following proper-
ties

$$f(x) = F(x)(1 - F(x)); \quad x = \lg F - \lg(1 - F); \quad \frac{dx}{dF} = \frac{1}{F(1 - F)}; \quad \sigma^2 = \pi^2/3. \tag{2.2}$$

Its graphical representation resembles closely the normal distribution.
The moment generating function $G_x(t)$ is

$$G_x(t) = \Gamma(1 + t)\Gamma(1 - t). \tag{2.3}$$

The parametric form of this distribution

$$F(x) = [1 + e^{-(\alpha + \beta x)}]^{-1} \tag{2.4}$$

is often used in biometric research. A vast literature, see for example, Berkson
[1] and Winsor [7], exists on the problem of the estimation of the parameters.

A bivariate logistic distribution is such that the two marginal distributions
are logistic. This holds if we put in non-parametric form

$$F(x, y) = [1 + e^{-x} + e^{-y}]^{-1}. \tag{2.5}$$

It may be noted that no mixed member exists. The analysis with respect to
the two variables $(X, Y)$ is facilitated by the relation

$$F(x, y) = F(y, x). \tag{2.6}$$

The function (2.5) cannot be split into the product of the marginal functions.
Therefore the variables are not independent. At the origin it follows from (1.8)
and (2.5) that

$$F(0, 0) = 1/3 = P(0, 0). \tag{2.7}$$

        1/6  |  1/3
       ------+------
        1/3  |  1/6

          GRAPH 1

The contents of the four quadrants are given in the schematic Graph 1. Curves
of equal probability

$$F(x, y) = \text{constant} \tag{2.8}$$

are traced in Graph 2. Some probability points are given in Table 1.
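
The quadrant contents can be checked directly from (1.7) and (2.5). The Python sketch below is illustrative only; the function names `F` and `F_margin` and the dictionary of quadrants are assumptions of the example.

```python
import math


def F(x, y):
    """Bivariate logistic probability function (2.5)."""
    return 1.0 / (1.0 + math.exp(-x) + math.exp(-y))


def F_margin(x):
    """Univariate logistic probability function (2.1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Contents of the four quadrants around the origin.
quadrants = {
    "lower_left": F(0, 0),                                   # X < 0, Y < 0
    "upper_right": 1 - F_margin(0) - F_margin(0) + F(0, 0),  # (1.7) at the origin
    "upper_left": F_margin(0) - F(0, 0),                     # X < 0, Y > 0
    "lower_right": F_margin(0) - F(0, 0),                    # X > 0, Y < 0
}

# For large y the bivariate function approaches the margin, as in (1.1).
margin_gap = abs(F(0.7, 50.0) - F_margin(0.7))
```

The four contents come out as 1/3, 1/3, 1/6 and 1/6, in agreement with the schematic Graph 1.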


GRAPH 2. Probability points for the bivariate logistic distribution (axes: $x$ and probability $F(x)$)


TABLE 1. THE BIVARIATE LOGISTIC PROBABILITY FUNCTION $F(x, y)$

3. THE DENSITY FUNCTION


The density function $f(x, y)$ is obtained from (1.4) and (2.5). The first dif-
ferentiation of (2.5) with respect to $x$ and $y$ yields

$$\frac{\partial F}{\partial x} = F^2 e^{-x}; \quad \frac{\partial F}{\partial y} = F^2 e^{-y}.$$

Consequently the bivariate density function is

$$f(x, y) = 2F^3(x, y)\, e^{-x-y} \tag{3.1}$$

while the marginal density functions are given in (2.1). Since $f(x, y) > 0$, condi-
tion (1.3) is verified. In contrast to the marginal function, the joint density
function cannot be expressed by the joint probability function alone.
Although the marginal distributions are symmetrical the bivariate function
is asymmetrical. For the proof it is sufficient to consider $x = y$. Then

$$f(-x, -x) = \frac{2e^{2x}}{(1 + 2e^{x})^3}; \quad f(x, x) = \frac{2e^{-2x}}{(1 + 2e^{-x})^3}.$$

The two expressions are equal only in the trivial case $x = 0$. To see which ex-
pression is larger we compare

$$e^{2x} + 6e^{x} + 12 + 8e^{-x} \quad \text{to} \quad e^{-2x} + 6e^{-x} + 12 + 8e^{x},$$

i.e.

$$\tfrac{1}{2}[e^{2x} - e^{-2x}] \quad \text{to} \quad (e^{x} - e^{-x}).$$

Since

$$\sinh 2x > 2 \sinh x; \quad x > 0$$

it follows that

$$f(-x, -x) > f(x, x). \tag{3.2}$$

Therefore, with the exception of $x = 0$, $y = 0$, the probability in the quadrant
above $(x, x)$ is smaller than that in the quadrant below $(-x, -x)$.
To obtain the curves of equal density

$$f(x, y) = c \tag{3.3}$$

we note that the unique maximum is located at $x = 0$, $y = 0$ and has from (3.1)
the density

$$f(0, 0) = 2/27 = .074074.$$

Therefore the interesting values for $c$ are

$$c = .01(.01).07.$$
The values of the variable $x$ along the diagonal $y = -x$ corresponding to the
density $c$ are from (3.1) the solutions of

$$2(1 + e^{x} + e^{-x})^{-3} = c;$$

hence,

$$e^{x} + e^{-x} = (2/c)^{1/3} - 1. \tag{3.4}$$

The corresponding values $\pm x$, easily obtained from tables of the hyperbolic
functions, are given in Table 2.

Along the line $y = 2x$ the values of $x$ corresponding to the density $c$ are the
solutions of

$$2e^{-3x}(1 + e^{-x} + e^{-2x})^{-3} = c$$

which leads again to (3.4). The same holds along the line $y = x/2$ if $x$ in equa-
tion (3.4) is replaced by $(x/2)$. Therefore Table 2 can be used to plot the points
of equal probability density for $y = -x$, $y = 2x$ and $y = x/2$.
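
Equation (3.4) can be solved in closed form, since its left side is twice the hyperbolic cosine of $x$. The Python sketch below is an illustration under that observation; the function names `density` and `x_on_antidiagonal` are assumptions of the example.

```python
import math


def density(x, y):
    """Bivariate logistic density (3.1): 2 F^3 exp(-x - y)."""
    F = 1.0 / (1.0 + math.exp(-x) + math.exp(-y))
    return 2.0 * F ** 3 * math.exp(-x - y)


def x_on_antidiagonal(c):
    """Positive root of e^x + e^-x = (2/c)^(1/3) - 1, i.e. equation (3.4)."""
    rhs = (2.0 / c) ** (1.0 / 3.0) - 1.0
    return math.acosh(rhs / 2.0)      # e^x + e^-x = 2 cosh x


x01 = x_on_antidiagonal(0.01)         # close to the 1.533 tabulated for c = .01
x07 = x_on_antidiagonal(0.07)         # close to the .238 tabulated for c = .07

# The same root serves the lines y = -x and y = 2x.
x04 = x_on_antidiagonal(0.04)
on_antidiagonal = density(x04, -x04)
on_line_2x = density(x04, 2 * x04)

peak = density(0.0, 0.0)              # 2/27 at the origin
```

The roots agree with the first row of Table 2 to three decimals, and the density evaluated at the computed points returns the chosen value of $c$ on both lines.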
Along the abscissa $x = 0$ and along the diagonal $x = y$, the density points cor-
responding to $c$ are two of the real solutions of the cubic equations

$$\frac{2}{c}\, e^{-y} = (2 + e^{-y})^3; \quad \frac{2}{c}\, e^{-2x} = (1 + 2e^{-x})^3. \tag{3.5}$$

The results are also given in Table 2. The last two lines illustrate the inequality
(3.2).


TABLE 2. POINTS OF EQUAL DENSITY

Density c                  .01      .02      .03      .04      .05      .06      .07

Variable x = -y         ±1.533   ±1.208            ±.805    ±.637    ±.463    ±.238

Variable y, x = 0        3.180    2.406            1.524    1.183     .825     .366
                        -2.417   -1.929           -1.312   -1.047    -.724    -.329

Variable x = y           2.400    1.917            1.307    1.046     .771     .404
                        -3.156   -2.392           -1.522   -1.181    -.842    -.432





The curves of equal density f(z, y) =c, obtained by this procedure are traced 
in Graph 3, which illustrates the asymmetry (3.2). It may be compared to Graph 
4 which shows the corresponding bivariate normal distribution to be derived 
later. 


4. REGRESSION ANALYSIS 


For the regression analysis it is sufficient from (2.6) to study one variable,
say $x$. The conditional density function $f(x \mid y)$ is from (1.9), (2.1) and (3.1)

$$f(x \mid y) = \frac{2e^{-x}(1 + e^{-y})^2}{(1 + e^{-x} + e^{-y})^3}. \tag{4.1}$$

Hence the conditional generating function $G(t_1 \mid y)$ defined in (1.10) is

$$G(t_1 \mid y) = 2(1 + e^{-y})^2 \int_{-\infty}^{+\infty} \frac{e^{-x} e^{t_1 x}\,dx}{(1 + e^{-x} + e^{-y})^3}. \tag{4.2}$$

We introduce a variable of integration $z$ defined by

$$1 + e^{-x} + e^{-y} = (1 + e^{-y})/z.$$

GRAPH 3. Equi-density curves for the bivariate logistic distribution, $c = f(x, y)$

As $x$ increases from $-\infty$ to $+\infty$, $z$ varies from zero to unity and

$$e^{-x}\,dx = (1 + e^{-y})\frac{dz}{z^2}; \quad e^{x} = \frac{z}{1 - z}\,\frac{1}{1 + e^{-y}}.$$

Consequently

$$G(t_1 \mid y) = 2(1 + e^{-y})^{-t_1} \int_0^1 \left(\frac{z}{1 - z}\right)^{t_1} z\,dz = 2F_2(y)^{t_1} \int_0^1 z^{t_1 + 1}(1 - z)^{-t_1}\,dz.$$

Therefore

$$G(t_1 \mid y) = F_2(y)^{t_1}\,\Gamma(2 + t_1)\Gamma(1 - t_1). \tag{4.3}$$

The logarithm of the conditional moment generating function is

$$\lg G(t_1 \mid y) = t_1 \lg F_2(y) + \lg(1 + t_1) + \lg \Gamma(1 + t_1) + \lg \Gamma(1 - t_1). \tag{4.4}$$

The first derivative

$$\lg' G(t_1 \mid y) = \lg F_2(y) + \frac{1}{1 + t_1} + \lg' \Gamma(1 + t_1) - \lg' \Gamma(1 - t_1)$$

leads for $t_1 = 0$ to the conditional expectations

$$E(x \mid y) = 1 + \lg F_2(y); \quad E(y \mid x) = 1 + \lg F_1(x). \tag{4.5}$$

GRAPH 4. Equi-density curves for the bivariate normal distribution, $c = f(x, y)$

Some values of the regression curves (4.5) are

$y = -\infty$, $E(x \mid -\infty) = -\infty$; $x = -\infty$, $E(y \mid -\infty) = -\infty$
$y = 0$, $E(x \mid 0) = .30685$; $x = 0$, $E(y \mid 0) = .30685$
$y = \infty$, $E(x \mid \infty) = 1$; $x = \infty$, $E(y \mid \infty) = 1$.

The regression curves are traced in Graph 5. The curves start at $x = -\infty$,
$y = -\infty$, and do not intersect at $x = 0$, $y = 0$, but at a value where

$$1 - \lg(1 + e^{-y}) = y$$

i.e. at the solution of

$$e^{-y} = 1/(e - 1).$$

Thus they intersect at the point

$$x = y = .5413 \tag{4.6}$$


of the diagonal. The conditional expectation of one variable is an increasing
function of the other one, but in contrast to the normal case they become
asymptotically parallel to the axis at a distance equal to unity.
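
The regression curve (4.5) and its intersection with the diagonal are easy to evaluate numerically. The Python sketch below reads "lg" as the natural logarithm, which is the reading consistent with the tabulated value .30685; the names `cond_mean` and `y_star` are assumptions of the example.

```python
import math


def cond_mean(y):
    """E(x | y) = 1 + lg F_2(y) = 1 - lg(1 + e^-y), natural logarithm."""
    return 1.0 - math.log1p(math.exp(-y))


at_zero = cond_mean(0.0)              # 1 - lg 2

# Intersection with the diagonal: e^-y = 1/(e - 1), i.e. y = lg(e - 1).
y_star = math.log(math.e - 1.0)

far_right = cond_mean(50.0)           # approaches the asymptote at unity
```

The value at zero is .30685, the fixed point is .5413 as in (4.6), and for large $y$ the curve approaches but does not exceed unity.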
The second derivative of (4.4) leads for $t_1 = 0$ to the conditional variances

$$\sigma^2(x \mid y) = -1 + \pi^2/3 = \sigma^2(y \mid x). \tag{4.7}$$
The conditional standard deviations are thus

$$\sigma(x \mid y) = \sigma(y \mid x) = 1.51323 < \sigma_x = \sigma_y = 1.81380. \tag{4.8}$$

As for the usual bivariate normal distribution the conditional variances of 
the bivariate logistic distribution (2.5) are constant and smaller than the un- 
conditional values. 

To obtain the coefficient of correlation for the bivariate logistic distribution
consider the bivariate moment generating function. From (1.12) and (4.3)

$$G(t_1, t_2) = \Gamma(2 + t_1)\Gamma(1 - t_1) \int_{-\infty}^{+\infty} \frac{e^{-y + t_2 y}\,dy}{(1 + e^{-y})^{2 + t_1}}. \tag{4.9}$$

The integral becomes by the transformation, $1 + e^{-y} = F^{-1}$,

$$\int_0^1 F^{1 + t_1}(1 - F)\left(\frac{1 - F}{F}\right)^{-t_2} \frac{dF}{F(1 - F)} = \int_0^1 F^{t_1 + t_2}(1 - F)^{-t_2}\,dF.$$

Consequently the bivariate generating function is from (4.9)

$$G(t_1, t_2) = \Gamma(1 - t_1)\Gamma(1 + t_1 + t_2)\Gamma(1 - t_2). \tag{4.10}$$

If we put $t_1=0$ or $t_2=0$ we obtain the univariate moment generating function (2.3). Two differentiations of the logarithm of the bivariate moment generating function with respect to $t_1$ and $t_2$ lead for $t_1=0$, $t_2=0$ to the expectation of the cross product
$$E(xy) = \left.\frac{\partial^2 \lg\Gamma(1+t_1+t_2)}{\partial t_1\,\partial t_2}\right|_{t_1=t_2=0} = \frac{\pi^2}{6}. \tag{4.11}$$
Since the marginal expectations are zero and the marginal standard deviations in (2.2) are $\pi/\sqrt{3}$ the coefficient of correlation $\rho$ has the fixed value
$$\rho = \frac{\pi^2}{6}\cdot\frac{3}{\pi^2} = \frac{1}{2}. \tag{4.12}$$
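
The fixed correlation of one half can also be seen by simulation. The sketch below is only an illustration under stated assumptions: it draws y from the logistic margin by inversion and then x from the conditional law, using the fact that $z = (1+e^{-y})/(1+e^{-x}+e^{-y})$ has density $2z$ on (0, 1) given y. The sampler, its names and the seed are hypothetical.

```python
import math
import random

def sample_pair(rng):
    """Draw (x, y) from F(x, y) = 1 / (1 + e^-x + e^-y) by conditioning on y."""
    u = rng.random()
    while u == 0.0:                      # keep u strictly inside (0, 1)
        u = rng.random()
    y = math.log(u / (1.0 - u))          # logistic margin by inversion
    z = math.sqrt(rng.random())          # density 2z on (0, 1)
    while z <= 0.0 or z >= 1.0:
        z = math.sqrt(rng.random())
    x = math.log(z / (1.0 - z)) - math.log(1.0 + math.exp(-y))
    return x, y

def sample_correlation(n, seed=0):
    """Product-moment correlation of n simulated pairs."""
    rng = random.Random(seed)
    pts = [sample_pair(rng) for _ in range(n)]
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pts)
    sxx = sum((p[0] - mx) ** 2 for p in pts)
    syy = sum((p[1] - my) ** 2 for p in pts)
    return sxy / math.sqrt(sxx * syy)

print(sample_correlation(100_000))   # close to 0.5, up to sampling error
```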


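The correlation ratios (1.13) are from (4.5) obtained as
$$\eta^2(x \mid y) = \frac{3}{\pi^2}\int_0^1 (1+\lg F(x))^2\,dF(x) = \eta^2(y \mid x).$$
Since
$$\int_0^1 (\lg x)^k\,dx = (-1)^k\,k!, \qquad k \ge 1,$$
the correlation ratios
$$\eta(x \mid y) = \eta(y \mid x) = \frac{\sqrt{3}}{\pi} \tag{4.13}$$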


are the reciprocals of the marginal standard deviations, a relation which has 
no analogy in the normal distribution. 

Since we know the marginal standard deviation and the coefficient of correlation we may compare the bivariate density function (3.1) to the normal one with mean zero, standard deviation $\pi/\sqrt{3}$ and coefficient of correlation $\rho=1/2$. Then the usual bivariate density function is
$$f(x, y) = (\sqrt{3}/\pi^3)\exp\left[-\frac{2}{\pi^2}(x^2 - xy + y^2)\right]. \tag{4.14}$$
It has a maximum at x=0, y=0 and its value is
$$f(0, 0) = \sqrt{3}\,\pi^{-3} = .055861,$$
less than the value 2/27, obtained for the logistic distribution. The curves corresponding to the constant densities c=.01(.01).05 are obtained as the solutions x, y of
$$x^2 - xy + y^2 = \frac{\pi^2}{2}\lg\frac{\sqrt{3}}{\pi^3 c}. \tag{4.15}$$








BIVARIATE LOGISTIC DISTRIBUTIONS 
This quadratic equation was solved for
$$x = 0; \quad x = y; \quad x = 2y; \quad x = y/2; \quad x = -y.$$
The points thus obtained were used to trace the normal density curves in Graph 4 which by their symmetry clearly differ from the asymmetry of the bivariate logistic distribution.

The normal regression curves corresponding to $\rho=1/2$, namely
$$E(x \mid y) = y/2; \qquad E(y \mid x) = x/2, \tag{4.16}$$
are traced in Graph 5. The conditional normal standard deviations
$$\sigma(x \mid y) = \sigma(y \mid x) = 1.57079 < \sigma_x = \sigma_y = 1.81380 \tag{4.17}$$
are larger than those for the logistic distribution given in (4.8).
5. DISTRIBUTIONS OF THE EXTREME VALUES 


In a sequence of n observations on the vector x, y with the bivariate logistic distribution, let $X_n$ and $Y_n$ be the largest observations from the first and second component. The joint probability function of $(X_n, Y_n)$ is
$$F^n(x, y) = (1+e^{-x}+e^{-y})^{-n}. \tag{5.1}$$
It is known [5] that the most probable largest values $u_n$ and $v_n$ in the univariate case are
$$u_n = v_n = \lg n.$$
Therefore the bivariate probability function for the reduced largest values
$$x - \lg n, \qquad y - \lg n$$
is
$$F^n(x + \lg n, y + \lg n) = \left(1 + \frac{e^{-x}+e^{-y}}{n}\right)^{-n}. \tag{5.2}$$
If n increases the asymptotic bivariate probability function $\Phi(x, y)$ defined by
$$\Phi(x, y) = \lim_{n\to\infty} F^n(x + \lg n, y + \lg n) \tag{5.3}$$
becomes, from (5.2),
$$\Phi(x, y) = e^{-e^{-x}-e^{-y}}. \tag{5.4}$$
Since
$$\Phi(x) = e^{-e^{-x}} \tag{5.5}$$
is the first asymptotic probability function of the largest values, the joint distribution of the largest values splits into the product of the univariate distributions
$$\Phi(x, y) = \Phi(x)\,\Phi(y) \tag{5.6}$$
and the largest values are asymptotically independent although the variables are dependent. J. Geffroy [3] has shown that this also holds for the normal distribution.
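
The passage from (5.2) to the product form (5.6) can be watched numerically. The following Python sketch evaluates the finite-n expression and the limiting product at one arbitrarily chosen point; the function names and the evaluation point are illustrative assumptions.

```python
import math

def max_cdf(x, y, n):
    """Finite-n probability function of the reduced largest values, equation (5.2)."""
    return (1.0 + (math.exp(-x) + math.exp(-y)) / n) ** (-n)

def gumbel_max(v):
    """First asymptotic probability function of largest values, equation (5.5)."""
    return math.exp(-math.exp(-v))

x, y = 0.3, -0.5
limit = gumbel_max(x) * gumbel_max(y)
for n in (10, 1_000, 1_000_000):
    # The gap to the product of the margins shrinks as n grows.
    print(n, max_cdf(x, y, n), abs(max_cdf(x, y, n) - limit))
```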

Consider now the joint limiting probability function of minima in n observations. The probability that the smallest observation on the first component will be greater than x and the smallest observation on the second component will be greater than y is from (1.7), (2.1) and (2.5)
$$P^n(x, y) = \left[1 - \frac{1}{1+e^{-x}} - \frac{1}{1+e^{-y}} + \frac{1}{1+e^{-x}+e^{-y}}\right]^n. \tag{5.7}$$
Introducing the most probable smallest values, $-\lg n$, and passing to the limit, we define $\Pi(x, y)$ as
$$\Pi(x, y) = \lim_{n\to\infty} P^n(x - \lg n, y - \lg n). \tag{5.8}$$
The right side is from (5.7)
$$\lim_{n\to\infty}\left[1 - \frac{1}{1+ne^{-x}} - \frac{1}{1+ne^{-y}} + \frac{1}{1+n(e^{-x}+e^{-y})}\right]^n = \lim_{n\to\infty}\left[1 - \frac{1}{n}\left(e^{x} + e^{y} - \frac{1}{e^{-x}+e^{-y}}\right)\right]^n.$$
If we pass to the limit
$$\Pi(x, y) = \exp\{-[e^{x} + e^{y} - (e^{-x}+e^{-y})^{-1}]\}. \tag{5.9}$$
Now
$$\Pi(x) = \exp[-e^{x}] \tag{5.10}$$
is the first asymptotic probability function of smallest values to exceed x. Thus (5.9) becomes
$$\Pi(x, y) = \Pi(x)\,\Pi(y)\exp[(e^{-x}+e^{-y})^{-1}]. \tag{5.11}$$
The third factor does not exist in the probability function of the smallest values, taken from the normal bivariate distribution.

The asymptotic probability function (5.11) is stable. The relation
$$\Pi^n(x - \lg n, y - \lg n) = \Pi(x, y) \tag{5.12}$$
holds of course for the first two factors in (5.11) but it also holds for the third factor since
$$\exp\left[\frac{n}{e^{-x+\lg n} + e^{-y+\lg n}}\right] = \exp[(e^{-x}+e^{-y})^{-1}].$$
Thus (5.11) is an asymptotic stable bivariate probability function of smallest values where the marginal functions are the stable asymptotic probability functions of smallest values of the first type. In contrast to the behavior of the largest values the asymptotic bivariate distribution of the smallest values does not split into the product of the asymptotic univariate distributions of smallest values. The largest values become asymptotically independent but the smallest values remain dependent. This is due to the asymmetry of the bivariate logistic distribution (2.5).
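
The contrast with the largest values can be made concrete. The sketch below, a hedged illustration and not part of the paper, evaluates the finite-n survival probability of the reduced minima, compares it with the limit $\exp\{-[e^{x}+e^{y}-(e^{-x}+e^{-y})^{-1}]\}$, and shows that this limit exceeds the product of the marginal limits $\exp[-e^{x}]\exp[-e^{y}]$. The names and the evaluation point are assumptions.

```python
import math

def survival_min(x, y, n):
    """P^n(x - lg n, y - lg n): e^-x and e^-y are replaced by n e^-x and n e^-y."""
    ex, ey = n * math.exp(-x), n * math.exp(-y)
    p = 1.0 - 1.0 / (1.0 + ex) - 1.0 / (1.0 + ey) + 1.0 / (1.0 + ex + ey)
    return p ** n

def pi_joint(x, y):
    """Limiting joint function of the smallest values."""
    third = 1.0 / (math.exp(-x) + math.exp(-y))
    return math.exp(-(math.exp(x) + math.exp(y) - third))

def pi_marginal(v):
    """First asymptotic probability function of smallest values to exceed v."""
    return math.exp(-math.exp(v))

x, y = 0.2, -0.4
print(survival_min(x, y, 1_000_000), pi_joint(x, y))   # nearly equal
print(pi_joint(x, y) > pi_marginal(x) * pi_marginal(y))  # True: no factorization
```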


6. OTHER BIVARIATE LOGISTIC DISTRIBUTIONS 


Evidently the bivariate logistic distribution analyzed up to now is but one 
example of such a function. Other bivariate distributions with logistic margins 
are obtained from systems of bivariate distributions with given margins [4, 6]. 
Two such systems where the marginal distributions are equal are
$$F(x, y) = F(x)F(y)[1 + \alpha(1 - F(x))(1 - F(y))]; \qquad -1 \le \alpha \le 1 \tag{6.1}$$
and
$$[-\lg F(x, y)]^m = [-\lg F(x)]^m + [-\lg F(y)]^m; \qquad m \ge 1. \tag{6.2}$$
The cases $\alpha=0$ in (6.1) and m=1 in (6.2) lead to independence of the two variables. If F(x) and F(y) in (6.1) are replaced by (2.1) and the corresponding expression in y, a bivariate logistic symmetrical probability function
$$F(x, y) = (1+e^{-x})^{-1}(1+e^{-y})^{-1}[1 + \alpha e^{-x-y}(1+e^{-x})^{-1}(1+e^{-y})^{-1}] \tag{6.3}$$
is obtained. In contrast to the distribution considered previously this formula represents a parametric family of distributions, where the correlation coefficient is a function of $\alpha$. The variables x and y are again written in reduced form. Equation (2.6) holds as previously while (2.7) becomes
$$F(0, 0) = \frac{1}{4}\left[1 + \frac{\alpha}{4}\right] = P(0, 0). \tag{6.4}$$
Both probabilities increase with increasing values of $\alpha$. Both the upper left and the lower right quadrant have the probability $\frac{1}{4}(1-\alpha/4)$. The density function is
$$f(x, y) = \frac{e^{-x-y}}{(1+e^{-x})^2(1+e^{-y})^2}\left[1 + \alpha\left(\frac{1-e^{-x}}{1+e^{-x}}\right)\left(\frac{1-e^{-y}}{1+e^{-y}}\right)\right]. \tag{6.5}$$
Evidently the equations (1.6) hold. The maximum located at x=0, y=0 is
$$f(0, 0) = 1/16, \tag{6.6}$$
independent of $\alpha$.

The regression curves
$$E(x \mid y) = \alpha(2F(y) - 1); \qquad E(y \mid x) = \alpha(2F(x) - 1) \tag{6.7}$$
are linear in probability scale, intersect at the common mean x=0, y=0 and vary from $-\alpha$ to $+\alpha$. The coefficient of correlation is linked to the parameter $\alpha$ by
$$\rho = 3\alpha/\pi^2 \tag{6.8}$$
and is limited by
$$|\rho| \le .30396 \tag{6.9}$$
while the two correlation ratios are $\alpha/\pi$.

The asymptotic bivariate distributions of extreme values split into the product of the univariate distributions of the extremes and are of the first type (5.5). Thus the extremes of the variables become asymptotically independent although the variables themselves are dependent for $\alpha \ne 0$.
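
A few of the stated properties of the one-parameter family can be confirmed directly. The Python sketch below is illustrative only; it assumes the probability function $F(x)F(y)[1+\alpha(1-F(x))(1-F(y))]$ with logistic margins and the density obtained from it by differentiation, and all function names are hypothetical.

```python
import math

def F(v):
    """Logistic margin."""
    return 1.0 / (1.0 + math.exp(-v))

def joint_cdf(x, y, alpha):
    """Symmetrical bivariate logistic probability function with parameter alpha."""
    return F(x) * F(y) * (1.0 + alpha * (1.0 - F(x)) * (1.0 - F(y)))

def joint_pdf(x, y, alpha):
    """Density: product of logistic densities times 1 + alpha (1 - 2F(x)) (1 - 2F(y))."""
    fx = math.exp(-x) / (1.0 + math.exp(-x)) ** 2
    fy = math.exp(-y) / (1.0 + math.exp(-y)) ** 2
    return fx * fy * (1.0 + alpha * (1.0 - 2.0 * F(x)) * (1.0 - 2.0 * F(y)))

def rho(alpha):
    """Coefficient of correlation as a linear function of alpha."""
    return 3.0 * alpha / math.pi ** 2

for alpha in (-1.0, 0.0, 0.5, 1.0):
    # Centre probability (1 + alpha/4)/4, centre density 1/16, correlation 3 alpha / pi^2.
    print(alpha, joint_cdf(0.0, 0.0, alpha), joint_pdf(0.0, 0.0, alpha), round(rho(alpha), 5))
```

At $\alpha = 1$ the last column gives the bound .30396 quoted for the correlation.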

The system (6.2) leads to the family of bivariate logistic distributions
$$F(x, y) = \exp\{-[(\lg(1+e^{-x}))^m + (\lg(1+e^{-y}))^m]^{1/m}\}. \tag{6.10}$$
When m=1, the joint probability function is the product of the two marginal functions. As m tends to infinity
$$\lim_{m\to\infty} F(x, y) = \min[(1+e^{-x})^{-1}, (1+e^{-y})^{-1}] \tag{6.11}$$
represents a degenerate case first studied in a general form by Fréchet [2].

The expression (6.10) is so involved that it can hardly be used for practical purposes. However, the system (6.2) is interesting for bivariate extreme values because the bivariate asymptotic probability functions $\Phi(x, y)$ [and $\Pi(x, y)$] are obtained by the introduction of the asymptotic probability functions $\Phi(x)$, $\Phi(y)$ [and $\Pi(x)$, $\Pi(y)$] and do not split.
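
The two limiting cases of the second family are easy to reproduce. The following sketch assumes the joint function is obtained by solving $[-\lg F(x,y)]^m = [-\lg F(x)]^m + [-\lg F(y)]^m$ with logistic margins; the function names are illustrative.

```python
import math

def F(v):
    """Logistic margin."""
    return 1.0 / (1.0 + math.exp(-v))

def joint_cdf_m(x, y, m):
    """Joint probability function of the second system for a given m >= 1."""
    a = -math.log(F(x))   # equals lg(1 + e^-x)
    b = -math.log(F(y))
    return math.exp(-((a ** m + b ** m) ** (1.0 / m)))

x, y = 0.3, -0.5
print(joint_cdf_m(x, y, 1), F(x) * F(y))         # m = 1: independence
print(joint_cdf_m(x, y, 50), min(F(x), F(y)))    # large m: close to the degenerate case
```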


CONCLUSION 


Although the logistic distribution closely resembles the normal one by its 
symmetry the bivariate logistic distribution (2.5) differs considerably from 
the usual bivariate normal one because it is asymmetric. The probability at 
the center is F(0, 0) =4 instead of 4. The curves of equal density differ from 
the normal ellipses. The densities in the lower left quadrant are higher than 
those in the upper right one. The regression curves clearly depart from the normal
linearity. They intersect at x=y=.5413, and become asymptotic to the
axis at a distance equal to unity. The conditional variances are constant and
smaller than the unconditional ones. The correlation ratios are the reciprocals
of the marginal standard deviations and the coefficient of correlation is constant
and equal to 1/2. Finally as for the normal distribution, the largest values
are asymptotically independent, although the original variables are depend- 
ent, but in contrast to the normal distribution this does not hold for the small- 
est values. 

The probability function (2.5) can be used in cases where the marginal distributions
are symmetrical and resemble the normal one if the sample coefficient
of correlation is of the order 1/2.

The bivariate logistic distribution (6.3) is more flexible than (2.5) because it contains a parameter $\alpha$. Independence enters as a special case $\alpha=0$. This bivariate distribution is symmetrical but the contents of the quadrants depend upon $\alpha$. The maximum of the density is independent of $\alpha$. The regression curves are linear in probability scale and intersect at the common mean. The coefficient of correlation is a linear function of the parameter $\alpha$ but remains within the limits ±.30396. In this case the extremes become asymptotically independent and their distribution is again of the first type.
UNBIASED COMPONENTWISE RATIO ESTIMATION


D. S. ROBSON AND C. VITHAYASAI
Cornell University


If the variates x and y are linearly related by a regression through the
origin and if the population mean $\bar X$ for the variate x is known, then a
ratio estimator such as $\bar y\bar X/\bar x$ is generally more efficient than the sample
mean $\bar y$ as an estimator of the population mean $\bar Y$. The efficiency of
ratio type estimation can sometimes be improved if the correlated
variates x and y can be expressed as the sum of k corresponding components,
$x=x_1+\cdots+x_k$ and $y=y_1+\cdots+y_k$. When the individual
components $x_i$ and $y_i$ are more highly correlated than x and y, a componentwise
ratio estimator such as $\sum\bar y_i\bar X_i/\bar x_i$ is generally more efficient
than $\sum\bar y_i\sum\bar X_i/\sum\bar x_i=\bar y\bar X/\bar x$. This increased efficiency is retained when
such estimators are adjusted to eliminate bias.


1, INTRODUCTION 


RATIO-TYPE methods of estimation of a population mean become efficient
when the variate y of interest is correlated with a more easily measured
variate x. In such cases both x and y are measured in a bivariate sample and
x alone is measured in an additional sample which, in the case of finite populations,
may include the entire population. The regression function E(y | x) is
estimated from the bivariate sample and evaluated at the mean value of x
in the larger sample to obtain an estimate of the population mean value of y.

When the regression function is approximated by a straight line through the origin, $E(y \mid x)=\beta x$, then for a bivariate sample of size n the estimator of $\beta$ takes the form of a ratio
$$\hat\beta = \sum_{i=1}^{n} x_i^{-(r-1)}y_i \Big/ \sum_{i=1}^{n} x_i^{-(r-2)},$$
where r is chosen so that the conditional variance of y for any given x is proportional to $x^r$,
$$\mathrm{var}(y \mid x) = c\,x^{r}.$$
The ratio estimator of the population mean $\bar Y$ then becomes
$$\hat Y = \hat\beta\hat X,$$
where $\hat X$ is the estimator of the population mean $\bar X$ obtained from the larger sample on x.

The precision of ratio-type methods can sometimes be substantially improved if the correlated variates x and y can be expressed as the sum of k more highly correlated components; that is, if $x=x_1+\cdots+x_k$ and $y=y_1+\cdots+y_k$, where the corresponding components $x_i$ and $y_i$ are more highly correlated than x and y. Estimation of $\bar Y$ by separate components, $\hat\beta_1\hat X_1+\cdots+\hat\beta_k\hat X_k$, in this case is generally more accurate than $\hat\beta\hat X$. An empirical example of this arises in the ratio estimation of total oven-dry matter yield of silage corn in field plots for which total green weight yield is measured; when both green weight x and dry weight y are measured separately for ears and the vegetative parts of the plants in a sample of corn hills, the componentwise estimator increases efficiency by approximately 70%. An example from general sample survey methodology is the case of cluster sampling with post stratification. Here x represents the number of elements in a cluster and y the cluster total for some measured character; the total number X of elements in the entire (finite) population is assumed known and the object is to estimate the population total Y for the variate y. If the x elements in a randomly chosen cluster are subsequently partitioned into k strata of size $x_1,\ldots,x_k$ for which the population totals $X_1,\ldots,X_k$ are known then a stratified or componentwise estimator of the type
$$\frac{\bar y_1}{\bar x_1}X_1+\cdots+\frac{\bar y_k}{\bar x_k}X_k$$
may be considerably more efficient than the corresponding non-stratified estimator
$$\frac{\bar y_1+\cdots+\bar y_k}{\bar x_1+\cdots+\bar x_k}(X_1+\cdots+X_k)=\frac{\bar y}{\bar x}X.$$


Ratio estimators of the type mentioned above are biased when the popula- 
tion size is finite; Hartley and Ross [2] succeeded, however, in constructing an 
unbiased ratio-type estimator for finite populations and Mickey [3] subse- 
quently developed a large class of such estimators. Here we are concerned 
primarily with the Hartley-Ross type of unbiased componentwise ratio esti- 
mator, for which we present exact variances and unbiased variance estimators. 
The efficiency of componentwise ratio estimation is then examined empirically 
with the data from 39 corn plots of 10 hills each, and the bias of the conven- 
tional ratio estimate and its variance formula are evaluated numerically. 


2. THE VARIANCE OF THE HARTLEY-ROSS TYPE OF COMPONENT-WISE 
RATIO ESTIMATOR 


The Hartley-Ross unbiased ratio estimator of the population total Y for a single component y takes the form
$$Y' = X\bar r + \frac{n(N-1)}{n-1}(\bar y - \bar r\bar x),$$
where $\bar r$ is the mean value of r=y/x in a random sample of size n from a population of size N, and X is the population total for x. Goodman and Hartley [1, p. 495] gave the limiting form of the variance of this estimator as
$$\lim_{N\to\infty}\frac{1}{N^2}\mathrm{var}(Y') = \frac{1}{n}\left[\sigma_y^2 + \bar R^2\sigma_x^2 - 2\bar R\sigma_{x,y} + \frac{1}{n-1}(\sigma_r^2\sigma_x^2 + \sigma_{r,x}^2)\right],$$
where $\bar R$ is the population mean of the ratio r=(y/x).
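
The unbiasedness of this estimator for a finite population can be checked by brute force. The sketch below enumerates every sample of size n drawn without replacement from a small population and averages the estimate $X\bar r + \frac{n(N-1)}{n-1}(\bar y - \bar r\bar x)$. The six-unit population is entirely hypothetical and serves only as an illustration; the function name is likewise an assumption.

```python
from itertools import combinations

def hartley_ross_total(sample, X_total, N):
    """Hartley-Ross type estimate of the population total of y from (x, y) pairs."""
    n = len(sample)
    r_bar = sum(y / x for x, y in sample) / n
    x_bar = sum(x for x, _ in sample) / n
    y_bar = sum(y for _, y in sample) / n
    return X_total * r_bar + n * (N - 1) / (n - 1) * (y_bar - r_bar * x_bar)

# Hypothetical population of N = 6 units (illustrative numbers only).
population = [(2.0, 3.1), (3.0, 4.0), (5.0, 8.2), (4.0, 5.5), (6.0, 9.9), (1.0, 2.4)]
N = len(population)
X_total = sum(x for x, _ in population)
Y_total = sum(y for _, y in population)

n = 3
estimates = [hartley_ross_total(s, X_total, N) for s in combinations(population, n)]
mean_estimate = sum(estimates) / len(estimates)

# The average over all possible samples reproduces the true total.
print(abs(mean_estimate - Y_total) < 1e-9)   # True
```

The same enumeration applied to the conventional estimate $\bar yX/\bar x$ would in general show a small bias.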
The exact variance for finite N was given by Robson [4, p. 518] in terms of
multivariate polykays and may be expressed in the notation of Tukey's symbolic
dot-multiplication [6, p. 49] as


: N(N-n)f: -—— 2 pi 
var(Y’) = —— —lo,+ RR: R-o, — 2R-o2, 
n 


4 l CS 2 ‘2°28 )| 
grey N O;"*Oxz N Or,2*Or.2 


All variances and covariances appearing in this formula are understood to be 
defined in the usual manner for finite populations; for example 


$$\sigma_{r,x} = \frac{1}{N-1}\left(\sum_{i=1}^{N} r_ix_i - N\bar R\bar X\right).$$
This definition, which arises naturally in the algebraic treatment of moments and cumulants of a finite population, also serves to illustrate what is meant by dot-multiplication, since
$$\sigma_{r,x} = \frac{1}{N}\sum_{i=1}^{N} r_ix_i - \frac{1}{N(N-1)}\sum_{i\ne j} r_ix_j = \overline{RX} - \bar R\cdot\bar X;$$
thus, the dot-product of two means is the mean of all possible crossproducts. The same is true for the dot-product of more than two means; for example,
$$\bar R\cdot\sigma_{x,y} = \bar R\cdot(\overline{XY} - \bar X\cdot\bar Y) = \bar R\cdot\overline{XY} - \bar R\cdot\bar X\cdot\bar Y = \frac{1}{N(N-1)}\sum_{i\ne j} r_ix_jy_j - \frac{1}{N(N-1)(N-2)}\sum_{i\ne j\ne k} r_ix_jy_k$$
and, similarly,
$$\bar R\cdot\bar R\cdot\sigma_x^2 = \frac{1}{N(N-1)(N-2)}\sum_{i\ne j\ne k} r_ir_jx_k^2 - \frac{1}{N(N-1)(N-2)(N-3)}\sum_{i\ne j\ne k\ne l} r_ir_jx_kx_l.$$





As N gets large, of course, the dot-product of two or more moments approaches 
the ordinary product of the moments, provided the latter approach a limit, and 
so the limiting var(Y’) of Goodman and Hartley is obtained. 
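
The meaning of a dot-product of means, the mean of all cross-products taken over distinct units, can be illustrated by enumeration. The sketch below is a hedged illustration on a hypothetical five-unit population; `dot_mean` is an illustrative helper, not a function from any library. It checks that the finite-population covariance of r and x (divisor N - 1) equals the mean of the products $r_ix_i$ minus the dot-product of the two means.

```python
from itertools import permutations
from math import prod

def dot_mean(*columns):
    """Mean of cross-products over all ordered selections of distinct units."""
    N = len(columns[0])
    terms = [
        prod(col[i] for col, i in zip(columns, idx))
        for idx in permutations(range(N), len(columns))
    ]
    return sum(terms) / len(terms)

# Hypothetical finite population (illustrative values only).
x = [2.0, 3.0, 5.0, 4.0, 6.0]
y = [3.1, 4.0, 8.2, 5.5, 9.9]
r = [yi / xi for xi, yi in zip(x, y)]
N = len(x)

mean_rx = sum(ri * xi for ri, xi in zip(r, x)) / N
R_bar, X_bar = sum(r) / N, sum(x) / N
cov_rx = (sum(ri * xi for ri, xi in zip(r, x)) - N * R_bar * X_bar) / (N - 1)

print(abs(cov_rx - (mean_rx - dot_mean(r, x))) < 1e-12)   # True
print(dot_mean(r, x), R_bar * X_bar)   # dot-product versus ordinary product of means
```

For small N the two numbers on the last line differ noticeably; they approach each other as N grows, which is the limiting statement made in the text.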

A minimum variance unbiased estimator of var(Y’) is easily constructed 
using the fact that polykays, or dot-products of sample cumulants, are mini- 
mum variance unbiased estimators of the corresponding polykays of the finite 
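population. Thus, for example, the minimum variance unbiased estimator of $\bar R\cdot\sigma_{x,y}$ is
$$\bar r\cdot s_{xy} = \frac{1}{n(n-1)}\sum_{i\ne j} r_ix_jy_j - \frac{1}{n(n-1)(n-2)}\sum_{i\ne j\ne k} r_ix_jy_k$$
or, expressed in more convenient computational form,
$$\bar r\cdot s_{xy} = \frac{1}{n(n-1)(n-2)}\left[(n-1)\sum_1^n r_i\sum_1^n x_iy_i - n\sum_1^n y_i^2 + \sum_1^n x_i\sum_1^n r_iy_i + \left(\sum_1^n y_i\right)^2 - \sum_1^n x_i\sum_1^n y_i\sum_1^n r_i\right].$$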
The other components of var(Y’) are similarly estimated; computing formulas 
for the estimates are given by Robson [5, p. 273] and will follow as special 
cases of the more general formulas given next for the component-wise ratio 
estimator. 
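
A computational form of this kind can be verified against the defining sums over distinct indices. The sketch below is an illustration under stated assumptions: it takes the estimator of $\bar R\cdot\sigma_{x,y}$ to be the sample analogue of the population dot-product (sums over distinct pairs and triples), uses r = y/x so that the sum of $r_ix_iy_i$ equals the sum of $y_i^2$, and compares the brute-force value with a closed form in simple sums. The function names and the three-unit sample are hypothetical.

```python
from itertools import permutations

def r_dot_sxy_bruteforce(r, x, y):
    """Sample dot-product from sums over distinct index pairs and triples."""
    n = len(r)
    pairs = sum(r[i] * x[j] * y[j] for i, j in permutations(range(n), 2))
    triples = sum(r[i] * x[j] * y[k] for i, j, k in permutations(range(n), 3))
    return pairs / (n * (n - 1)) - triples / (n * (n - 1) * (n - 2))

def r_dot_sxy_computational(r, x, y):
    """The same quantity written with simple sums only."""
    n = len(r)
    Sr, Sx, Sy = sum(r), sum(x), sum(y)
    Sxy = sum(a * b for a, b in zip(x, y))
    Sry = sum(a * b for a, b in zip(r, y))
    Syy = sum(b * b for b in y)   # equals the sum of r*x*y because r = y/x
    total = (n - 1) * Sr * Sxy - n * Syy + Sx * Sry + Sy ** 2 - Sr * Sx * Sy
    return total / (n * (n - 1) * (n - 2))

x = [1.0, 2.0, 3.0]
y = [2.0, 2.0, 6.0]
r = [b / a for a, b in zip(x, y)]
print(r_dot_sxy_bruteforce(r, x, y), r_dot_sxy_computational(r, x, y))  # both 16/6
```

The closed form needs only a handful of running sums, which is the practical reason for preferring it to the enumeration.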
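The general case we wish to consider is
$$Y' = \sum_{i=1}^{k} Y_i' = \sum_{i=1}^{k}\left[X_i\bar r_i + \frac{n(N-1)}{n-1}(\bar y_i - \bar r_i\bar x_i)\right],$$
where now
$$\mathrm{var}(Y') = \sum_{i=1}^{k}\mathrm{var}(Y_i') + 2\sum_{i<j}\mathrm{cov}(Y_i', Y_j').$$
Since the individual terms $\mathrm{var}(Y_i')$ take the form indicated earlier for a single component estimator, the only new algebraic problem is the computation of $\mathrm{cov}(Y_i', Y_j')$, and by the same methods used earlier this may be shown to take the analogous form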


. . N(N — n) aes ad 
cov(¥ ri Yj) = eT ee ee + R;: 4°Cs;0; Ri oz; y; = Rj y;,2; 
n 
I (1 1 PS ios ] 
P sy” 6 Gresty° Fag.23 x4 > Ory,23°Fx5,7;) | 
n—1 N N : g 


Computing formulas for the minimum variance unbiased estimators of the terms in this covariance formula are shown below for the case i=1, j=2; sample means are expressed in the manner indicated earlier as, for example,
$$\overline{y_1y_2} = \frac{1}{n}\sum_{i=1}^{n} y_{1i}y_{2i}$$
and all products represent ordinary products, as
$$\bar y_1\bar y_2 = \frac{1}{n^2}\left(\sum_{1}^{n} y_{1i}\right)\left(\sum_{1}^{n} y_{2i}\right).$$
In addition, the abbreviation $(n)_m$ is used for $n(n-1)\cdots(n-m+1)$. Thus,
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= [yiye + yry2|n*/(n)2 
= {n?[2(yrys = Lilie = X2yi"2) —(rn- 1) (ritiys + reyits) 
= 2) (xirarire) =~ (yrye + ritetyr)| + n®[(n aA 2) riretite + riZiys 
+ rl F2 + Pol yP Le + Toto + L122 Po] = nr rot yo} /(n)s 


- {n?[(n _ L)riniye = Yiy2 > Xaraya + ye] r n®ryxyy2}/(n)s 


= {n*[(n? — 8n + I)rretits + (n — 1) (iri) — Yry2 + 2ayare + izays 
reyide) + Yrs + Titatire] — n®[(n — 2)(riratate + riretite 
rliYe + Pilot + Poly Lo + Teter} + n‘rirotyte} /(n)4 

= {n2[(n? — 3n + 1)rytetire + (mn — 1)(Rirys — Yrye + Payite 


+ rie + Ley iF2) + Yye + riretsrs] = n'[(n 7 2) (ryxetir + T2017 2) 


+ Tilo + ek Le + Bia i2 + retey| + nr sFox12} /(n)4. 


Computing formulas for estimating the components of $\mathrm{var}(Y_i')$ may be obtained
from the above formulas by putting $(r_1, x_1, y_1) = (r_2, x_2, y_2) = (r_i, x_i, y_i)$.


3. AN EMPIRICAL EVALUATION OF COMPONENT-WISE RATIO ESTIMATION OF 
CORN PLOT TOTAL DRY WEIGHT 


Crop yield in agronomic experiments with silage corn is ordinarily measured 
in terms of total dry matter production per plot. Dry weight can be measured 
accurately only by drying the harvested plant material in ovens and there are, 
of course, distinct limitations on the amount of material which can be handled 
in this manner. Green, or fresh weight of the production from a plot, however, 
can be measured directly in the field as the material is harvested, and since 
green and dry weight are highly correlated the total dry weight for the plot 
can be accurately estimated by determining the dry matter percentage in a 
sample from the plot and applying this sample dry matter per cent to the 
measured total green weight. For the purpose of measuring the sampling error 
in this method of estimation, green and dry weight determinations were made 
on 390 individual hills of corn in an experiment containing an early, medium, 
and a late maturing variety arranged in plots of 10 hills.* These weight de- 
terminations were made separately for the ears and stovers of each hill (stovers 
=husks+stalks+leaves), thus providing an opportunity also to examine the 
efficiency of a component-wise estimator of plot total dry weight. The separate 
and combined components of hill green and dry weights are summarized 
graphically in Figure 1, showing that a somewhat higher green weight-dry 
weight correlation exists for the separate components than for the combined 
components. Average within-plot correlations between green and dry weight of 
ears, stovers, and ears+stovers were .953, .932, and .824, respectively. 

For each plot the efficiency of the unbiased component-wise ratio estimator $Y' = Y'_{\text{stover}} + Y'_{\text{ear}}$ relative to the unbiased combined ratio estimator $Y'_{\text{stover+ear}}$ was computed for samples of n hills, n = 2, 3, ..., 9. These efficiencies, in the form of a variance ratio $\mathrm{var}(Y'_{\text{stover+ear}})/\mathrm{var}(Y'_{\text{stover}} + Y'_{\text{ear}})$, were relatively constant for all n; the average efficiencies over all 39 plots are shown in Table 1. The two components of the estimator, $Y'_{\text{stover}}$ and $Y'_{\text{ear}}$, were correlated in this experiment (Table 2), but to a much lesser degree than the green and dry weights within each component. The variances $\mathrm{var}(Y'_{\text{stover+ear}})$, $\mathrm{var}(Y'_{\text{stover}} + Y'_{\text{ear}})$, $\mathrm{var}(Y'_{\text{stover}})$, $\mathrm{var}(Y'_{\text{ear}})$ and $\mathrm{cov}(Y'_{\text{stover}}, Y'_{\text{ear}})$ employed in Tables 1 and 2 were computed directly from the formulas given earlier.

* Data provided by R. L. Cushing, formerly with the Department of Plant Breeding, Cornell University, now Director of Research, The Pineapple Research Institute of Hawaii, Honolulu, Hawaii.


TABLE 1. AVERAGE RELATIVE EFFICIENCY OF THE UNBIASED
COMPONENT-WISE ESTIMATOR

n=4 n=5 n=6 n=7

1.69 1.69 1.69 1.69 1.69 1.69


TABLE 2. AVERAGE CORRELATION BETWEEN THE TWO
COMPONENTS OF THE ESTIMATOR

n=5 | n=6 | n=7

.241 | .242 | .242 .242


In addition to this evaluation, the data provided an opportunity to compare
the sampling error of the unbiased ratio estimator with the error mean square
of the more conventional, but biased, ratio estimator $\hat Y = \bar yX/\bar x$. This was
accomplished by enumerating all possible samples of size n for each plot of N = 10
hills, computing the conventional ratio estimate for each such sample, and then
averaging the squared error (estimate − known plot dry weight)², over all
$\binom{N}{n}$ samples. Averaged over all 39 plots, the error mean squares (EMS) for
the three estimators Y,,., Y,, &, compared to the variances of the corresponding 
unbiased ratio estimators as shown in Table 3. 


TABLE 3. COMPARISON OF ERROR MEAN SQUARES OF BIASED AND 
UNBIASED RATIO ESTIMATORS BASED ON A COMPLETE ENU- 
MERATION OF ALL POSSIBLE SAMPLES FROM 
39 CORN PLOTS OF 10 HILLS EACH 








n=2 n=3 n=4 n=5 n=6 n=7 n=8 n=9 





EMS(f,,.) | 45,603 | 26,102 | 16,665} — | 7,718 | 4,718 | 2,748 | 1,220 
var(Y?,.) 46,233 | 26,275 | 16,744 | 11,113 | 7,389 | 4,742 | 2,762 | 1,226 
EMS(¥,) 9,612 | 5,362 | 3,382] — |1,477| 945] 549| 243 
var(Y!) 9,466 | 5,317 | 3,374| 2,235 | 1,484] 952] 554] 246 
EMS(f,) 14,222} 8,102] 5,169] — | 2,286| 1,468] 856] 380 
var(Y/) 14,458 | 8,197 | 5,217 | 3,461 | 2,300] 1,476] 860] 382 





























The bias of the conventional ratio estimator is negligible in this case, even for small samples. The largest bias observed in all 39 plots was for samples of size n=1 from the plot having the lowest total production; the total dry matter yield of this plot was 3568 grams while the average value of the conventional ratio estimator for the 10 possible samples of size n = 1 was 3682.5 grams, showing a bias of 3.2%. For all 39 plots the average total yield per plot was 4255.00 grams and the average value of the conventional ratio estimator for samples of size n=1 was 4,254.30. The bias steadily decreased as sample size increased, and at n=9 the average estimate was 4,255.04. In this particular application, then, there is a negligible difference in both mean and variance of the biased and unbiased estimators, and in practice the conventional, biased estimator offers the labor saving advantage that individual hills in the sample need not be weighed and dried separately but may be handled in bulk.

AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1961

[Fig. 1, panels for Variety A and Variety B; axis: dry weight in grams.]


Finally, the actual error mean square of the biased estimator $\hat Y = \bar yX/\bar x$ can be compared to the variance approximation
$$\mathrm{var}(\hat Y) \doteq \frac{N(N-n)}{n}\,\bar Y^2\left[\frac{\sigma_y^2}{\bar Y^2} + \frac{\sigma_x^2}{\bar X^2} - \frac{2\sigma_{x,y}}{\bar X\bar Y}\right].$$
This comparison is shown graphically in Figure 2. A tendency for this approximation to underestimate the true error mean square decreases as sample size increases since the actual error mean square decreases at a faster rate than the function (N−n)/n. The same is true when the variance approximation is compared to the true variance of the unbiased estimator. Figure 3 shows this result for the component-wise estimator and for the two components separately; here,


[Fig. 1. Green and dry weight per hill of corn plants for three varieties planted in a uniformity trial. Panels: Variety A, Variety B, Variety C; axis: dry weight in grams.]

[Fig. 2. A comparison of the error mean square of the ratio of means estimator and the variance computed by the standard approximation for the variance of a ratio. Legend: error mean square of biased ratio estimator; conventional variance approximation. Horizontal axis: sample size.]

[Fig. 3. A comparison of the exact variance of unbiased ratio estimator and the variance computed by the standard approximation for the variance of a ratio. Legend: exact variance of unbiased ratio estimator; conventional variance approximation. Horizontal axis: sample size.]

the unbiased component $Y'_{\text{stover}}$ is denoted simply by Y', the component
$Y'_{\text{ear}}$ by V' and the sum Y'+V' by W'.
A NOTE ON CURVE FITTING WITH MINIMUM DEVIATIONS
BY LINEAR PROGRAMMING 


WALTER D. FISHER
Kansas State University


For some time it has been well known among specialists in mathematical
programming that the statistical problem of fitting a linear
multiple regression with the criterion of minimizing the sum of absolute
deviations from the regression function (rather than squared deviations)
may be reduced to a linear programming problem. But this
knowledge seems not widespread among general statisticians.¹ This
note briefly reviews the formulation of this application of linear programming,
and its history.


1. FORMULATION 


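SUPPOSE that it is desired to fit the linear function
$$\hat x_{i1} = a + b_{12}x_{i2} + \cdots + b_{1k}x_{ik} \tag{1}$$
to the observations
$$\begin{matrix} x_{11} & \cdots & x_{1k} \\ x_{21} & \cdots & x_{2k} \\ \vdots & & \vdots \\ x_{n1} & \cdots & x_{nk}, \end{matrix}$$
the element $x_{ij}$ representing observation i on variable j, and n being larger than k. Assume that the fitting is to be accomplished so that the sum of the absolute deviations
$$S = \sum_{i=1}^{n} |x_{i1} - \hat x_{i1}|$$
is minimized.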
Let the parameters in equation (1) be now expressed as
$$\begin{aligned} a &= y_1 - z_1 \\ b_{12} &= y_2 - z_2 \\ &\;\;\vdots \\ b_{1k} &= y_k - z_k \end{aligned} \tag{2}$$
where the y's and the z's are 2k non-negative variables to be determined. Let the residual for observation i be expressed as
$$x_{i1} - \hat x_{i1} = u_i - v_i \qquad (i = 1, \ldots, n), \tag{3}$$
where the u's and v's are 2n non-negative variables to be determined. By substituting (1) and (2) into (3), the system of n linear equations
$$\begin{aligned} u_1 - v_1 + y_1 - z_1 + x_{12}y_2 - x_{12}z_2 + \cdots + x_{1k}y_k - x_{1k}z_k &= x_{11} \\ u_2 - v_2 + y_1 - z_1 + x_{22}y_2 - x_{22}z_2 + \cdots + x_{2k}y_k - x_{2k}z_k &= x_{21} \\ &\;\;\vdots \\ u_n - v_n + y_1 - z_1 + x_{n2}y_2 - x_{n2}z_2 + \cdots + x_{nk}y_k - x_{nk}z_k &= x_{n1} \end{aligned} \tag{4}$$
is obtained. The problem then is to find non-negative values of the u's, v's, y's and z's that satisfy system (4) and that will also minimize the linear form
$$R = \sum_{i=1}^{n}(u_i + v_i). \tag{5}$$
This is a linear programming problem in 2(n+k) variables and n constraints. The simplex method of Dantzig [2] may be applied directly, and the solution always obtained. For automatic computation a suitable unit basis for the first stage can be formed from the set of u's and v's. At the minimum, R=S.

1 In a recent article in this Journal, for example, Karst [5] has proposed a method of minimizing absolute deviations in a linear regression problem with only one independent variable, which is essentially a repetition of the method described in 1930 by Rhodes ([6], p. 978).
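
The formulation can be made concrete by writing out the objective and constraint arrays. The Python sketch below is only an illustration: it builds the cost vector, the equality-constraint matrix and the right-hand side for the variable order u, v, y, z, and then checks that a given set of regression coefficients maps to a feasible non-negative point whose objective value equals the sum of absolute deviations. No solver is called; the function names and the five observations are hypothetical, and the arrays could be handed to whatever linear programming routine is available.

```python
def build_lad_lp(data):
    """Return (c, A, b) for the least-absolute-deviations linear program.

    data[i] = [x_i1, x_i2, ..., x_ik]; column 1 is the fitted variable.
    Variable order: u_1..u_n, v_1..v_n, y_1..y_k, z_1..z_k (all non-negative).
    """
    n, k = len(data), len(data[0])
    c = [1.0] * (2 * n) + [0.0] * (2 * k)
    A, b = [], []
    for i, row in enumerate(data):
        regressors = [1.0] + list(row[1:])           # intercept, x_i2, ..., x_ik
        u = [1.0 if j == i else 0.0 for j in range(n)]
        v = [-1.0 if j == i else 0.0 for j in range(n)]
        A.append(u + v + regressors + [-g for g in regressors])
        b.append(row[0])
    return c, A, b

def encode(params, data):
    """Map coefficients (a, b_12, ..., b_1k) to a feasible non-negative LP point."""
    u, v = [], []
    for row in data:
        fitted = params[0] + sum(p * x for p, x in zip(params[1:], row[1:]))
        resid = row[0] - fitted
        u.append(max(resid, 0.0))
        v.append(max(-resid, 0.0))
    y = [max(p, 0.0) for p in params]
    z = [max(-p, 0.0) for p in params]
    return u + v + y + z

# Hypothetical data: rows are [x_i1, x_i2]; the last observation is an outlier.
data = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0], [10.0, 5.0]]
c, A, b = build_lad_lp(data)
w = encode([0.0, 1.0], data)                         # the line x_1 = x_2

feasible = all(abs(sum(a * wi for a, wi in zip(row, w)) - bi) < 1e-12
               for row, bi in zip(A, b))
R = sum(ci * wi for ci, wi in zip(c, w))
print(feasible, R, len(w))   # True 5.0 14
```

With n = 5 and k = 2 the program has 2(n+k) = 14 variables and 5 constraints, as the count in the text requires.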

The following properties of the solution may be easily verified. For each i,
either $u_i$ or $v_i$ is zero, and for each j either $y_j$ or $z_j$ is zero.² For at least k of the
observations, both $u_i$ and $v_i$ are zero (the hyperplane passes through at least k
points).³ The solution may not be unique, as is the case when a single median
is to be fitted to an even number of observations.
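
The property that some solution passes through at least k data points suggests a brute-force check for very small problems. The sketch below is not the simplex method and is offered only as an illustration: for a line with an intercept and one slope (k = 2) it tries every line through two data points and keeps the one with the smallest sum of absolute deviations. The function name and the data are hypothetical, and the approach is practical only for a handful of observations.

```python
from itertools import combinations

def least_lines_fit(points):
    """Best line through two data points, judged by total absolute deviation."""
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue                      # vertical line, not a function of x
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        total = sum(abs(y - (intercept + slope * x)) for x, y in points)
        if best is None or total < best[0]:
            best = (total, intercept, slope)
    return best

# Hypothetical data: four points on a line plus one outlier.
points = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 10)]
total, intercept, slope = least_lines_fit(points)
print(total, intercept, slope)   # 5.0 0.0 1.0
```

The fitted line ignores the outlier and passes through four of the five points, in contrast to a least-squares line, which would be pulled toward the fifth.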

If it is desired to constrain the regression function (1) by imposing additional 
side conditions such as, for example, the requirement that the hyperplane 
pass through certain specified points, this can be done by adding these condi- 
tions as additional equations to the system of constraints (4). If it is desired to 
specify that certain parameters of (1) be non-negative, this could be most 
easily done by using the original parameter as a variable in the linear program- 
ming problem without conversion to a difference since the variables of.a linear 
programming problem are, by definition, non-negative. For example, if bi: is 
not to go negative, we could delete the second equation of (2) and use by 
directly in (4), rather than y.—2. If the desired regression function is nonlin- 
ear, but additive in the sense that it can be written in the form 


1= f2(x2) t**: + fi(rx), 


2 If both u; and »; were positive, R could be reduced and the constraints still satisfied by subtracting the same 
positive constant from both uj and vj. Strictly speaking, both yj and z; could be positive (an infinity of solutions 
being obtainable by adding a series of identical positive constants to both yj; and z;), but it is convenient to charac- 
terize the solution by specifying the smaller of y; or z; to be zero for all j, and this occasions no loss of generality to 
the curve-fitting problem. The simplex method provides this characterization because of the requirement that at 
each stage a linearly independent basis be selected. (If both uj and vj, or both yj and z; were non-zero, the basis 
vectors so formed would be linearly dependent—cef. [1], p. 141.) 

3 A solution always exists with this property, but the property is a necessary condition only if the solution is
unique. Proof of these assertions can be obtained as follows. If a solution hyperplane is claimed that does not
pass through k data points, it can be shown (by methods analogous to those used by Rhodes [6], p. 977 ff. and
Karst [5], p. 118 ff.) that this hyperplane can be displaced or rotated by a small amount so that it does pass through
k data points and has no higher an R value than before. Therefore a solution hyperplane can be found that goes
through at least k data points. Such a hyperplane corresponds with an extreme point of the constraint set of the
linear programming problem (in the space of the 2n + 2k variables of that problem), provided that we bar the
possibility discussed in the preceding footnote that for some j $y_j$ and $z_j$ are both positive. Moreover, a well known
theorem of linear programming stipulates that if the solution is unique, it must occur at such an extreme point.
Therefore, if the solution hyperplane to the curve-fitting problem is unique, it must pass through at least k data points.
it can be transformed into a linear function by making transformations of the
independent variables, as is often done in least squares regression. If it is de- 
sired to weight the observed data unequally in the fitting, the desired weights 
should be inserted as coefficients in the objective function (5), rather than unit 
weights. 

From the foregoing it is apparent that a flexible and useful method is avail- 
able for fitting a linear regression function when the sum of absolute values of 
deviations is to be minimized. When automatic computers are available, the 
method is adaptable to large numbers of variables. For brevity, and for con- 
trast with least-squares methods, the method described may be called the 
method of least lines. 
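
As a small illustration of the least lines criterion for a single independent variable, the sketch below fits a line by minimizing the sum of absolute deviations. It does not use the simplex method: it is a brute-force search that relies on the property, noted in the footnote above, that an optimal line passing through at least two data points always exists. The function name `least_lines_fit` and the data are hypothetical.

```python
from itertools import combinations


def least_lines_fit(points):
    """Brute-force least-lines fit of y = a + b*x (one independent variable).

    Assumption: instead of linear programming, every line through two data
    points with distinct x is tried, since an optimal line through at least
    two data points always exists. Returns (R, a, b), where R is the
    minimized sum of absolute deviations.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(points, 2):
        if x1 == x2:
            continue  # a vertical line is not a regression function of x
        b = (y2 - y1) / (x2 - x1)
        a = y1 - b * x1
        r = sum(abs(y - (a + b * x)) for x, y in points)
        if best is None or r < best[0]:
            best = (r, a, b)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    return best


# Hypothetical data: four points on y = x and one extreme observation.
data = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 20)]
R, a, b = least_lines_fit(data)
```

With these data the fitted line is y = x, so the extreme observation at x = 4 contributes all of R and does not pull the line toward itself. The search costs on the order of N cubed operations, which is why the linear programming formulation is preferable for anything beyond toy problems.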


2. HISTORY 


The least lines problem was recognized by Fourier in the 1820’s as one that
could be handled by an iterative procedure he proposed that is similar to the
simplex method.* Edgeworth [3] and Rhodes [6] pointed out that the
circumstances needed to justify the method of least squares as an optimal method,
notably random sampling and normal distribution, often do not exist, and
that in other circumstances least squares may give undue weight to extreme
observations.

The methods proposed by Edgeworth® applied to only two variables. The 
methods of Rhodes [6] and Singleton [7], while extending the proposals of 
Edgeworth to more than two dimensions, become extremely unwieldy as the 
number of dimensions increases. In 1955, Charnes, Cooper, and Ferguson [1]
showed that a certain management problem involving a minimization of abso- 
lute values could be transformed to standard linear programming form by 
employing the device of representing a deviation as the difference between 
two non-negative variables. 

The method of least lines as a type of multiple regression has been suggested 
by Reinfeld and Vogel [8], and used by Arrow and Hoffenberg [9]. Undoubt- 
edly there have been other uses not known to the author. 
PARTIAL CORRELATIONS IN REGRESSION COMPUTATIONS


ROBERT L. GUSTAFSON
Michigan State University 


The partial correlation between the dependent variable and the 
k-th independent variable in a regression can be calculated simply from 
the k-th regression coefficient and its standard error. Partial correla- 
tions are useful in determining relations among the estimates obtained 
under alternative assumptions about which variables in a regression 
are independent of the disturbances. 


This note describes some simple algebraic relationships which have been
found useful in applications of least squares regression analysis. The
relationships pertain to:
1) Computation of partial correlation coefficients. 
2) Use of partial correlation coefficients in computations required to change 
the choice of which variable in a regression should be “dependent.” 
1. NOTATION 
Let


    $X_1 = \beta_0 + \beta_2 X_2 + \beta_3 X_3 + \cdots + \beta_P X_P + \mu_1$    (1)

with the disturbance $\mu_1$ assumed to be independent of $X_2, X_3, \ldots, X_P$; and let

    $m_{ij} = S X_i X_j - (S X_i)(S X_j)/N, \quad i, j = 1, 2, \ldots, P,$


where S denotes summation over the sample observations, and N is the number
of observations. Using matrix-vector notation for convenience, let

    $M = ((m_{ij})), \quad i, j = 2, 3, \ldots, P;$

    $m_1 = ((m_{1i})), \quad i = 2, 3, \ldots, P;$

    $C = M^{-1} = ((c_{ij})), \quad i, j = 2, 3, \ldots, P.$

Then the vector of least squares estimates of $\beta_2, \beta_3, \ldots, \beta_P$ is $b = C m_1$, with
elements $b_2, b_3, \ldots, b_P$. The estimated variance of $\mu_1$ is $s_1^2 = (m_{11} - b'm_1)/n$,
where n is the number of degrees of freedom (n = N - P). And the matrix of
estimated variances and covariances of the elements of b is $s_1^2 C$.

2. PARTIAL CORRELATION COEFFICIENTS 


For k = 2, 3, ..., P, let $r_{1k.}$ be the sample highest order partial correlation
between $X_1$ and $X_k$. That is,


    $r_{1k.}^2 = (R_1^2 - R_{1(k)}^2)/(1 - R_{1(k)}^2)$    (2)


where $R_1^2 = b'm_1/m_{11}$ is the multiple coefficient of determination obtained in
estimating Equation (1), and $R_{1(k)}^2$ is the multiple coefficient of determination
that would be obtained if $X_k$ were omitted from the regression.¹





1 It can easily be shown that (2) follows from other standard definitions of the partial correlation coefficient.

To compute $r_{1k.}^2$, a simple formula is


    $r_{1k.}^2 = b_k^2/(b_k^2 + n s_{b_k}^2)$    (3)


where $b_k$ is the estimate of $\beta_k$ and $s_{b_k}$ is the computed standard error of $b_k$.
In cases where the ratios $t_k = b_k/s_{b_k}$ are obtained, (3) can be conveniently
written as


    $r_{1k.}^2 = t_k^2/(t_k^2 + n)$.    (4)


Relations such as (3) and (4) are somewhat widely known by “oral
tradition,”² but they apparently have been generally unnoticed by writers on
computation procedures.³ Their validity, however, follows easily from (2) and the
following additional relations:

    $s_{b_k}^2 = c_{kk} s_1^2$;    $s_1^2 = m_{11}(1 - R_1^2)/n$;

and


    $m_{11}(R_1^2 - R_{1(k)}^2) = b_k^2/c_{kk}$.    (5)
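
The agreement between formula (4) and definition (2) can be checked numerically. The sketch below is only an illustration under stated assumptions: the data are hypothetical, there are two independent variables (P = 3), and the 2 x 2 matrix M is inverted by hand so that no library is needed. The helper names `moments`, `r_sq_from_t` and `r_sq_from_definition` are not from the original note.

```python
def moments(cols):
    """Matrix of m_ij = S X_i X_j - (S X_i)(S X_j)/N for a list of columns."""
    N = len(cols[0])
    sums = [sum(c) for c in cols]
    return [[sum(u * v for u, v in zip(ci, cj)) - sums[i] * sums[j] / N
             for j, cj in enumerate(cols)] for i, ci in enumerate(cols)]


# Hypothetical data: X1 is the dependent variable, X2 and X3 the regressors.
X1 = [3.1, 4.0, 6.2, 6.9, 9.1, 9.8, 12.2, 12.7]
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
X3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
m = moments([X1, X2, X3])        # m[0][0] is m_11, m[0][1] is m_12, ...
N, P = len(X1), 3
n = N - P                        # degrees of freedom

# C = M^{-1} for the 2 x 2 matrix M of moments among X2 and X3.
det = m[1][1] * m[2][2] - m[1][2] ** 2
C = [[m[2][2] / det, -m[1][2] / det],
     [-m[1][2] / det, m[1][1] / det]]
b = [C[0][0] * m[0][1] + C[0][1] * m[0][2],    # b_2
     C[1][0] * m[0][1] + C[1][1] * m[0][2]]    # b_3
s1_sq = (m[0][0] - b[0] * m[0][1] - b[1] * m[0][2]) / n


def r_sq_from_t(j):
    """Formula (4): r^2 = t^2 / (t^2 + n); j = 0 for X2, j = 1 for X3."""
    t_sq = b[j] ** 2 / (C[j][j] * s1_sq)
    return t_sq / (t_sq + n)


def r_sq_from_definition(j):
    """Definition (2): compare R^2 with and without the j-th regressor."""
    R_sq = (b[0] * m[0][1] + b[1] * m[0][2]) / m[0][0]
    kept = 2 - j                 # index in m of the regressor that is kept
    R_sq_omit = m[0][kept] ** 2 / (m[0][0] * m[kept][kept])
    return (R_sq - R_sq_omit) / (1 - R_sq_omit)
```

Calling `r_sq_from_t(0)` and `r_sq_from_definition(0)` gives the squared partial correlation between X1 and X2 by the two routes; they should agree to rounding error, and likewise for index 1.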


3. CHANGING THE CHOICE OF WHICH VARIABLE IS DEPENDENT 


For k = 2, 3, ..., P, let h be the vector of least squares estimates of the
coefficients $\theta_1, \ldots, \theta_{k-1}, \theta_{k+1}, \ldots, \theta_P$ in the alternative specification


    $X_k = \theta_0 + \theta_1 X_1 + \cdots + \theta_{k-1} X_{k-1} + \theta_{k+1} X_{k+1} + \cdots + \theta_P X_P + \mu_k$    (6)


where the disturbance $\mu_k$ is assumed to be independent of $X_1, \ldots, X_{k-1}$,
$X_{k+1}, \ldots, X_P$; and let $A = ((m_{ij}))$, $m_k = ((m_{ki}))$, $G = A^{-1} = ((g_{ij}))$, where the
element indexes i, j take the values 1, ..., k-1, k+1, ..., P in each matrix
or vector.


The regression for Equation (6) (i.e., elements of h, G, etc.) can be obtained
from the results of computations for Equation (1) (i.e., b, C, etc.), by making
use of the squared partial correlation coefficient $r_{1k.}^2$, as follows. Let

    $q_k = (1 - r_{1k.}^2)/c_{kk}$.


Then the elements of $h = G m_k$ are⁵


    $h_1 = r_{1k.}^2/b_k$;

    $h_i = -h_1 b_i - q_k c_{ik}$,    i = 2, ..., k-1, k+1, ..., P.    (7)


The estimated variance of the disturbance $\mu_k$, $s_k^2 = (m_{kk} - h'm_k)/n$, is


    $s_k^2 = q_k/n$.    (8)


2 They were first brought to my attention some years ago by Yehuda Grunfeld.

3 Other ways of getting partial correlations when the regression is done by the standard “C-Method,” all of
which require comparatively extensive computations, are those described by, for example: Ezekiel and Fox [2, p.
503], Fisher [3, Sec. 32], Foote [4, p. 142, second paragraph], Kendall [6, pp. 370-2 and 374-5], and Walker and
Lev [9, pp. 342-3].

4 The crucial relation is (5), which is given by, for example: Fisher [3, Sec. 29.1], Kendall [7, p. 169], and
Schultz [8, pp. 743-5]. For a sketch of an alternative proof of (3), see Appendix 1.

5 The expression for $h_1$ is well-known.
The multiple coefficient of determination, $R_k^2 = h'm_k/m_{kk}$, is


    $R_k^2 = 1 - q_k/m_{kk}$.    (9)


And the matrix of estimated variances and covariances of the elements of h is
$s_k^2 G$, where the elements of G are


    $g_{11} = h_1 c_{kk}/b_k$;
    $g_{1i} = h_1 c_{ik} - g_{11} b_i$,    i = 2, ..., k-1, k+1, ..., P;    (10)
    $g_{ij} = c_{ij} - g_{1i} b_j + h_i c_{jk}$,    i, j = 2, ..., k-1, k+1, ..., P.


The computations may be checked by the requirements that $h'm_k + n s_k^2 = m_{kk}$,
and $G m_k = h$ or $GA = I$.
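
A numerical check of this section's shortcut for changing the dependent variable may make the bookkeeping concrete. The sketch assumes P = 3 and k = 2 (so X2 becomes the dependent variable), uses hypothetical data, and reads the shortcut as h_1 = r²/b_k, h_i = -h_1 b_i - q_k c_ik and s_k² = q_k/n; it then compares the results with a direct regression of X2 on X1 and X3. The helpers `moments` and `inv2` are illustrative names.

```python
def moments(cols):
    """Matrix of m_ij = S X_i X_j - (S X_i)(S X_j)/N for a list of columns."""
    N = len(cols[0])
    sums = [sum(c) for c in cols]
    return [[sum(u * v for u, v in zip(ci, cj)) - sums[i] * sums[j] / N
             for j, cj in enumerate(cols)] for i, ci in enumerate(cols)]


def inv2(a):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


# Hypothetical data for X1, X2, X3.
X1 = [3.1, 4.0, 6.2, 6.9, 9.1, 9.8, 12.2, 12.7]
X2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
X3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
m = moments([X1, X2, X3])
n = len(X1) - 3                  # degrees of freedom, N - P

# Regression (1): X1 on X2 and X3.
C = inv2([[m[1][1], m[1][2]], [m[2][1], m[2][2]]])
b2 = C[0][0] * m[0][1] + C[0][1] * m[0][2]
b3 = C[1][0] * m[0][1] + C[1][1] * m[0][2]
s1_sq = (m[0][0] - b2 * m[0][1] - b3 * m[0][2]) / n
t_sq = b2 ** 2 / (C[0][0] * s1_sq)
r_sq = t_sq / (t_sq + n)         # squared partial correlation of X1 and X2

# Shortcut with k = 2: quantities for the regression of X2 on X1 and X3.
q2 = (1 - r_sq) / C[0][0]
h1 = r_sq / b2
h3 = -h1 * b3 - q2 * C[1][0]     # c_ik with i = 3, k = 2
s2_sq = q2 / n

# Direct regression of X2 on X1 and X3, for comparison.
G = inv2([[m[0][0], m[0][2]], [m[2][0], m[2][2]]])
h1_direct = G[0][0] * m[1][0] + G[0][1] * m[1][2]
h3_direct = G[1][0] * m[1][0] + G[1][1] * m[1][2]
s2_sq_direct = (m[1][1] - h1_direct * m[1][0] - h3_direct * m[1][2]) / n
```

The shortcut needs only b, C and the squared partial correlation from the first regression, so no second matrix inversion is required; the direct computation here exists solely to confirm the result.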
APPENDIX 1 


Although it does not follow strictly from the relations given in Section 2, it
is also true of course that


    $r_{1k.} = b_k/(b_k^2 + n s_{b_k}^2)^{1/2} = t_k/(t_k^2 + n)^{1/2}$.


Let $u_1$ be the vector of residuals from the regression of $X_1$ on $X_2, X_3, \ldots, X_P$;
$v_1$ be the vector of residuals from the regression of $X_1$ on $X_2, X_3, \ldots, X_{k-1}$,
$X_{k+1}, \ldots, X_P$; and $v_k$ be the vector of residuals from the regression of $X_k$ on
$X_2, X_3, \ldots, X_{k-1}, X_{k+1}, \ldots, X_P$; so that, by a standard definition of the
partial correlation,

    $r_{1k.} = v_1'v_k/(v_1'v_1 \, v_k'v_k)^{1/2}$.

It can be shown that

    $v_1'v_k = b_k/c_{kk}$;

    $v_1'v_1 = u_1'u_1 + b_k^2/c_{kk}$;

and

    $v_k'v_k = 1/c_{kk}$.

Hence,

    $r_{1k.} = b_k/(b_k^2 + c_{kk} u_1'u_1)^{1/2} = b_k/(b_k^2 + n s_{b_k}^2)^{1/2}$.


APPENDIX 2 


Equations (7)-(10) can be derived by getting G from C by applying
procedures (see Cochran [1] or Kendall [7, pp. 167-70]) for deleting an
independent variable ($X_k$) from, and then adding an independent variable ($X_1$)
to, a regression. They may be verified more easily, however, by making use of
the matrix D of order P, defined as


    $D = ((d_{ij})) = ((m_{ij}))^{-1}, \quad i, j = 1, 2, \ldots, P.$


All the relevant quantities can be expressed in terms of elements of D (see
Ezekiel and Fox [2, pp. 507-25], Foote [4], Friedman and Foote [5], or
Schultz [8]). That is:


    $b_i = -d_{1i}/d_{11}$,    i = 2, 3, ..., P;    (11)
    $c_{ij} = d_{ij} - d_{1i}d_{1j}/d_{11}$,    i, j = 2, 3, ..., P;    (12)
    $s_1^2 = 1/(n d_{11})$;    (13)

and, for k = 2, 3, ..., P:

    $r_{1k.}^2 = d_{1k}^2/(d_{11}d_{kk})$;    $q_k = 1/d_{kk}$;
    $h_i = -d_{ik}/d_{kk}$,    i = 1, ..., k-1, k+1, ..., P;
    $g_{ij} = d_{ij} - d_{ik}d_{jk}/d_{kk}$,    i, j = 1, ..., k-1, k+1, ..., P;
    $s_k^2 = 1/(n d_{kk})$.

To verify, for example, the expression for $h_i$ (i = 2, ..., k-1, k+1, ..., P)
in (7), write:


    $-h_1 b_i - q_k c_{ik} = -(d_{1k}/d_{kk})(d_{1i}/d_{11}) - (1/d_{kk})(d_{ik} - d_{1i}d_{1k}/d_{11})$
    $= -d_{ik}/d_{kk} = h_i$.

Regression computations can be done, of course, using D itself, but this has 
some disadvantages for general use, in particular: 

1) the matrix inversion computation is larger; 

2) the matrix C (or its equivalent) has to be computed, in any case, in order
to obtain estimated variances and covariances of the estimated regression
coefficients; and

3) the “C-Method” is clearly preferable when regressions are done using 
alternative dependent variables with the same set of independent variables. 

The principal advantage in using D computationally is its “symmetry” with
respect to the choice of which variable (among the given set $X_1, X_2, \ldots, X_P$)
should be taken as dependent; and this advantage is considerably lessened by
the availability, when needed, of Equations (7)-(10).

It may be noted, incidentally, that should one wish to obtain D explicitly, 
after completing computations for Equation (1) by the C-Method, this can 
be done from Equations (13), (11) and (12), in that order. 
FACTORIAL TREATMENTS IN RECTANGULAR
LATTICE DESIGNS* 


LEROY STANLEY BRENNA† AND CLYDE YOUNG KRAMER
Virginia Agricultural Experiment Station 


This paper reports how factorial treatment combinations may be 
used in rectangular lattice designs. The analysis of variance is obtained 
both with and without recovery of inter-block information. Explicit 
formulae are presented for the variances and covariances of the treat- 
ment effects as well as single degree-of-freedom comparisons. In fact, 
this paper extends all the uses of factorial treatment combinations in 
complete block designs to rectangular lattice designs. 


1. INTRODUCTION AND SUMMARY

LATTICE designs discussed by Yates [18, 19], Harshbarger [6, 7, 8, 9, 10] and
other writers have proved to be useful in many experimental situations.
Since the incorporation of factorials by Cornish [4], Rao and Nair [14], 
Harshbarger [11], Kramer and Bradley [12, 13], Zelen [20], and Walpole [17] 
increased the usefulness of other types of incomplete block designs, the useful- 
ness of rectangular lattice designs should be increased by incorporating factorial 
treatment combinations in them. 

This paper obtains both the intra-block analysis and the analysis with
recovery of inter-block information for factorials in rectangular lattices having
nk (2<n<k) treatments with (k-1) replications and with q repetitions of each of
the basic designs. The adjusted sum of squares for treatments is determined as
a function of the treatment estimators, which facilitates the incorporation of
factorial effects.

Independent sums of squares for a basic two-factor factorial are obtained
and tests of significance are presented. The variances and covariances of the
estimators of factorial main effects and interactions are given as well as a
method for determining single degree of freedom comparisons. By an appropriate
selection of single degree of freedom comparisons multi-factor factorials may
be incorporated in rectangular lattices.

2.1 RECTANGULAR LATTICE DESIGNS WITH (k—1) REPLICATES 

If orthogonal latin squares are used for determining the treatment
assignments to the blocks in a rectangular lattice, then independent sums of squares
for factorial effects may be obtained. If we consider a set of (k-1) mutually
orthogonal (k × k) latin squares where the first column is in standard order, the
relationships that exist for the rectangular designs are:


nk = number of treatment combinations,
n = number of treatments in each block,





* A condensation of part of a dissertation by L. S. Brenna, written under the direction of C. Y. Kramer and
submitted to the Virginia Polytechnic Institute in partial fulfillment of the requirements for the Ph.D. degree in 
statistics. 

† Now with Texaco, Inc., New York City, New York.
k = number of blocks in each replicate,
and 
(k—1)=number of replications. 


The method of constructing this class of design may be summarized as 
follows: 


1. Write down a set of (k-1) mutually orthogonal (k × k) latin squares in
standard order.
2. Delete the first column of each of these (k-1) squares.
3. Assign treatment combinations $V_{ij}$ to each of the (k-1) rectangular
arrays where (ij) corresponds to row and column, respectively.
4. Choose the first n (2<n<k) columns in each block, thereby forming
(k-1) replicates of k blocks each.
5. Within each of the n-columned rectangular arrays, assign to blocks those
treatments which have a common letter.


2.2 INTRA-BLOCK ANALYSIS OF A RECTANGULAR LATTICE 


The intra-block model for a rectangular lattice repeated q times is


    $y_{ijsgf} = \mu + \tau_{ij} + \beta_{sgf} + \rho_{gf} + \epsilon_{ijsgf}$,    i, s = 1, ..., k;
                                                   f = 1, ..., q;
                                                   g = 1, ..., (k-1);
                                                   j = 1, ..., n,    (2.2.1)

where $y_{ijsgf}$ is the observation on the ijth treatment in the sth block of the gth
replicate of the fth repetition if the ijth treatment occurs in the sth block of the
gth replicate of the fth repetition, $\mu$ is the over-all mean, $\tau_{ij}$ is the ijth treatment
effect, $\beta_{sgf}$ is the effect of block s in the gth replication of the fth repetition,
$\rho_{gf}$ is the effect of the gth replication in the fth repetition, and $\epsilon_{ijsgf}$ are
independent normal variates with zero means and homogeneous variances, $\sigma^2$.
Restrictions on the parameters in (2.2.1) are

    $\sum_i \sum_j \tau_{ij} = 0$,  $\sum_s \beta_{sgf} = 0$,  and  $\sum_g \sum_f \rho_{gf} = 0$.    (2.2.2)

Note that only certain combinations of values of i, j, and s are possible and these
are determined by the nature of the experimental design.
If $T_{ij}$ denotes the total of the observations receiving the ijth treatment and
$B_{ij.}$ the sum of the totals of blocks containing the ijth treatment, then the
normal equations for this set of designs yield


    $T_{ij} - B_{ij.}/n = q(nk - n - k)t_{ij}/n + q(t_{i.} + t_{.j})/n$,    (2.2.3)


where


    $t_{i.} = \sum_j t_{ij}$  and  $t_{.j} = \sum_i t_{ij}$.


The system of equations (2.2.3) may be solved to obtain


    $t_{ij} = C_0 Q_{ij} + C_1 \sum_i Q_{ij} + C_2 \sum_j Q_{ij}$,    (2.2.4)

where

    $Q_{ij} = T_{ij} - B_{ij.}/n$,

    $C_0 = n/[q(nk - n - k)]$,

    $C_1 = -1/[q(k - 1)(nk - n - k)]$,

    $C_2 = -n/[qk(n - 1)(nk - n - k)]$.
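
The constants above can be checked by substitution. The sketch below builds the right-hand side of (2.2.3) from a hypothetical set of treatment estimates that sum to zero, then recovers them with the constants C0, C1 and C2. It assumes that C1 multiplies the sum of Q_ij over i and C2 the sum over j, which is the pairing under which the substitution closes; the design sizes and the values in `t` are illustrative only. Exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction as F

k, n, q = 4, 3, 2                 # hypothetical design sizes
base = n * k - n - k              # the recurring factor (nk - n - k)

# Hypothetical treatment estimates t_ij (k rows, n columns), summing to zero.
t = [[F(v) for v in row] for row in
     [[3, -1, 2], [0, 4, -5], [-2, 1, 1], [-3, 2, -2]]]

t_row = [sum(t[i]) for i in range(k)]                            # t_i.
t_col = [sum(t[i][j] for i in range(k)) for j in range(n)]       # t_.j

# Right-hand side of (2.2.3): Q_ij = q(nk-n-k) t_ij / n + q (t_i. + t_.j) / n
Q = [[F(q * base, n) * t[i][j] + F(q, n) * (t_row[i] + t_col[j])
      for j in range(n)] for i in range(k)]

C0 = F(n, q * base)
C1 = F(-1, q * (k - 1) * base)
C2 = F(-n, q * k * (n - 1) * base)

Q_row = [sum(Q[i]) for i in range(k)]                            # sum over j
Q_col = [sum(Q[i][j] for i in range(k)) for j in range(n)]       # sum over i

# Solution (2.2.4): t_ij = C0 Q_ij + C1 (sum over i) + C2 (sum over j)
t_back = [[C0 * Q[i][j] + C1 * Q_col[j] + C2 * Q_row[i]
           for j in range(n)] for i in range(k)]
```

Here `t_back` reproduces `t` exactly. The zero-sum restriction on the estimates is essential: without it the system (2.2.3) is not inverted by these constants.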


The adjusted treatment sum of squares is


    Adj. Treat. S.S. $= \sum_i \sum_j t_{ij}^2/C_0 + q(\sum_i t_{i.}^2 + \sum_j t_{.j}^2)/n$.    (2.2.5)


If we let $B_{sgf}$ denote the total of the observations in the sth block of the gth
replication in the fth repetition and $R_{gf}$ denote the total of the observations of
the gth replication in the fth repetition, the remaining sums of squares required
for the intra-block analysis of variance are calculated by


    Rep. S.S. $= \sum_g \sum_f R_{gf}^2/nk - G^2/nkq(k - 1)$,    (2.2.6)
    Blocks within replicates S.S. $= \sum_s \sum_g \sum_f B_{sgf}^2/n - \sum_g \sum_f R_{gf}^2/nk$,    (2.2.7)
    Total S.S. $= \sum_i \sum_j \sum_s \sum_g \sum_f y_{ijsgf}^2 - G^2/qnk(k - 1)$,    (2.2.8)

and


    Intra-block error S.S. = By subtraction.


2.3 BASIC TWO-FACTOR FACTORIAL IN RECTANGULAR LATTICES 


To incorporate factorial treatment combinations we shall define $V_{ij}$ as the
ith level of factor A and the jth level of factor C where A has k levels and C has
n levels. Thus


    $\tau_{ij} = \alpha_i + \gamma_j + \delta_{ij}$    (2.3.1)


with restrictions


    $\sum_i \alpha_i = 0$,  $\sum_j \gamma_j = 0$,  $\sum_i \delta_{ij} = 0$,  and  $\sum_j \delta_{ij} = 0$.    (2.3.2)


Substitution of (2.3.1) into (2.2.1) may be regarded simply as a one-to-one
transformation in the parameter space; see [12]. It follows that


    $t_{ij} = a_i + c_j + d_{ij}$    (2.3.3)


where $t_{ij}$, $a_i$, $c_j$, and $d_{ij}$ are the least squares estimators of $\tau_{ij}$, $\alpha_i$, $\gamma_j$, and $\delta_{ij}$,
respectively. Substituting (2.3.3) into (2.2.5) we may write


    Adj. Treat. S.S. $= qk(n - 1) \sum_i a_i^2 + qk(k - 1) \sum_j c_j^2$

                     $+ q(nk - n - k) \sum_i \sum_j d_{ij}^2/n$.    (2.3.4)
Cochran’s theorem [2] permits us to write

    Adj. S.S. (A) $= qk(n - 1) \sum_i a_i^2$,    (2.3.5)


    Adj. S.S. (C) $= qk(k - 1) \sum_j c_j^2$,    (2.3.6)


    Adj. S.S. (AC) $= q(nk - n - k) \sum_i \sum_j d_{ij}^2/n$,    (2.3.7)


with (k-1), (n-1) and (n-1)(k-1) degrees of freedom, respectively, where

    $a_i = \sum_j t_{ij}/n = \bar{t}_{i.}$,    (2.3.8)

    $c_j = \sum_i t_{ij}/k = \bar{t}_{.j}$,    (2.3.9)

and

    $d_{ij} = t_{ij} - \bar{t}_{i.} - \bar{t}_{.j}$.    (2.3.10)


Table I summarizes the intra-block analysis of variance for a basic two factor
factorial.
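
The partition of the adjusted treatment sum of squares into A, C and AC components can be confirmed numerically. The following sketch uses hypothetical treatment estimates that sum to zero and exact rational arithmetic; it evaluates (2.2.5) directly and compares it with the sum of the three factorial components. The design sizes and the values in `t` are assumptions made for illustration.

```python
from fractions import Fraction as F

k, n, q = 4, 3, 2                 # hypothetical design sizes
base = n * k - n - k

# Hypothetical treatment estimates t_ij, summing to zero.
t = [[F(v) for v in row] for row in
     [[3, -1, 2], [0, 4, -5], [-2, 1, 1], [-3, 2, -2]]]

t_row = [sum(t[i]) for i in range(k)]                            # t_i.
t_col = [sum(t[i][j] for i in range(k)) for j in range(n)]       # t_.j

a = [t_row[i] / n for i in range(k)]          # A-factor estimates
c = [t_col[j] / k for j in range(n)]          # C-factor estimates
d = [[t[i][j] - a[i] - c[j] for j in range(n)] for i in range(k)]

# Adjusted treatment S.S. as in (2.2.5), using 1/C0 = q(nk-n-k)/n.
ss_treat = (F(q * base, n) * sum(x * x for row in t for x in row)
            + F(q, n) * (sum(x * x for x in t_row) + sum(x * x for x in t_col)))

ss_A = q * k * (n - 1) * sum(x * x for x in a)
ss_C = q * k * (k - 1) * sum(x * x for x in c)
ss_AC = F(q * base, n) * sum(x * x for row in d for x in row)
```

The identity `ss_treat == ss_A + ss_C + ss_AC` holds exactly for any estimates whose grand total is zero, which is what allows the three components to be tested separately.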


2.4 INDIVIDUAL COMPARISONS AND MULTI-FACTOR FACTORIALS 


Individual or single degree-of-freedom comparisons are possible in much the 
usual way as for whole block analysis. 


TABLE I. INTRA-BLOCK ANALYSIS FOR A BASIC TWO-FACTOR FACTORIAL


Source of Variation          Degrees of Freedom       Sum of Squares

Replications                 q(k-1)-1                 $\sum_g \sum_f R_{gf}^2/nk - G^2/nkq(k-1)$
Blocks within replicates     q(k-1)²                  $\sum_s \sum_g \sum_f B_{sgf}^2/n - \sum_g \sum_f R_{gf}^2/nk$
Treatments (adjusted)        (nk-1)                   $\sum_i \sum_j t_{ij}^2/C_0 + q(\sum_i t_{i.}^2 + \sum_j t_{.j}^2)/n$
  A-factor (adjusted)        (k-1)                    $qk(n-1) \sum_i \bar{t}_{i.}^2$
  C-factor (adjusted)        (n-1)                    $qk(k-1) \sum_j \bar{t}_{.j}^2$
  AC-interaction (adjusted)  (nk-n-k+1)               $\sum_i \sum_j (t_{ij} - \bar{t}_{i.} - \bar{t}_{.j})^2/C_0$
Intra-block error            kq(k-1)(n-1)-nk+1        By subtraction

Total                        qnk(k-1)-1               $\sum_i \sum_j \sum_s \sum_g \sum_f y_{ijsgf}^2 - G^2/qnk(k-1)$
Let $\xi$ be a k by (k-1) matrix containing (k-1) mutually orthogonal vectors,
used to transform the $\alpha_i$’s. Let $\eta$ be an n by (n-1) matrix containing (n-1)
mutually orthogonal vectors, used to transform the $\gamma_j$’s. Contrasts on A-factor
effects would be


    $\xi_u = \sum_i \xi_{iu} \alpha_i$,    u = 1, ..., (k-1)    (2.4.1)


and on C-factor effects,

    $\eta_v = \sum_j \eta_{jv} \gamma_j$,    v = 1, ..., (n-1).    (2.4.2)


The contrast used to test the hypothesis $\xi_u = 0$ is


    $I_u = \sum_i \xi_{iu} \bar{t}_{i.} = \sum_i \sum_j \xi_{iu} t_{ij}/n$,    (2.4.3)


    Adj. S.S. $I_u = qk(n - 1)(\sum_i \xi_{iu} \bar{t}_{i.})^2 / \sum_i \xi_{iu}^2$,    (2.4.4)


    Adj. S.S. $I_u = qk(n - 1)(\sum_i \sum_j \xi_{iu} t_{ij})^2 / n^2 \sum_i \xi_{iu}^2$.    (2.4.5)


The test of the hypothesis $\eta_v = 0$ proceeds similarly. Thus


    $J_v = \sum_j \eta_{jv} \bar{t}_{.j} = \sum_i \sum_j \eta_{jv} t_{ij}/k$,    (2.4.6)

    Adj. S.S. $J_v = qk(k - 1)(\sum_j \eta_{jv} \bar{t}_{.j})^2 / \sum_j \eta_{jv}^2$,    (2.4.7)


    Adj. S.S. $J_v = q(k - 1)(\sum_i \sum_j \eta_{jv} t_{ij})^2 / k \sum_j \eta_{jv}^2$.    (2.4.8)


The test of the hypothesis $(\xi\eta)_{uv} = 0$ is made with the contrast


    $(IJ)_{uv} = \sum_i \sum_j \xi_{iu} \eta_{jv} t_{ij}$,    (2.4.9)


and

    Adj. S.S. $(IJ)_{uv} = q(nk - n - k)(\sum_i \sum_j \xi_{iu} \eta_{jv} t_{ij})^2 / n \sum_i \sum_j (\xi_{iu} \eta_{jv})^2$.    (2.4.10)


Cochran’s theorem [2] may be used to show that the above adjusted sums
of squares are independent each with one degree of freedom and F-tests are
effected using the intra-block error mean square.
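
A short numerical illustration of why single degree-of-freedom contrasts are a partition of the factor sum of squares: for a complete set of (k-1) mutually orthogonal contrasts, each summing to zero, the individual adjusted sums of squares add up to the adjusted sum of squares for the A-factor. The values below are hypothetical (k = 3, n = 2, q = 1), and the contrast vectors are the usual orthogonal pair for three levels.

```python
from fractions import Fraction as F

k, n, q = 3, 2, 1                       # hypothetical design sizes
coef = q * k * (n - 1)                  # coefficient qk(n-1) of the A-factor S.S.

# Hypothetical A-factor means (row means of the treatment estimates), sum zero.
a_bar = [F(5), F(-2), F(-3)]

# Two mutually orthogonal contrast vectors, i.e. the columns of a 3 x 2 matrix.
xi = [[1, -1, 0],
      [1, 1, -2]]

# Single degree-of-freedom S.S.: qk(n-1) (sum xi_iu * mean_i)^2 / sum xi_iu^2
ss_contrasts = [
    coef * sum(x * m for x, m in zip(vec, a_bar)) ** 2 / sum(x * x for x in vec)
    for vec in xi
]

# Adjusted S.S. for the A-factor: qk(n-1) * sum of squared means.
ss_A = coef * sum(m * m for m in a_bar)
```

The two contrast sums of squares add to `ss_A` exactly. The same bookkeeping applies to the C-factor and to the interaction contrasts with their own coefficients.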
To incorporate multi-factor factorials all that is necessary is to consider that
the various levels of the A-factor and the C-factor are themselves factorial 
combinations of other factors. 

If we consider that each level of the A-factor is made up of p factors, $A_1, \ldots,
A_p$, with $k_1, \ldots, k_p$ levels, respectively, where


    $\prod_{i=1}^{p} k_i = k$,


then $\xi$ may be chosen such that the respective sums of squares of the contrasts
may be grouped to form Adj. S.S. ($A_i$) with ($k_i$-1) degrees of freedom and all
the possible interaction sums of squares. These sums of squares may also be
computed by forming two-way, three-way, etc., tables of values of $\bar{t}_{i.}$, and
carrying the calculations through as though the $\bar{t}_{i.}$’s were individual
observations, only finally multiplying the resulting sums of squares by the coefficient
of (2.4.4), qk(n-1).

Likewise if we consider that each level of the C-factor is made up of q factors,
$C_1, \ldots, C_q$, with $n_1, \ldots, n_q$ levels, respectively, where


    $\prod_{j=1}^{q} n_j = n$,


then with the proper selection of $\eta$ the respective sums of squares of the
contrasts may be grouped to obtain the adjusted sums of squares of the q factors
and all the possible interactions. The estimates $\bar{t}_{.j}$ may also be used to obtain
sums of squares for the $C_j$ factors in the same manner defined for sums of
squares of $A_i$, followed by multiplying the resulting sums of squares by the
coefficient qk(k-1). After defining both $\xi$ and $\eta$, the contrasts used to form
sums of squares for the interactions of $A_i$ and $C_j$ follow immediately.

If one is interested in obtaining only the adjusted sums of squares for the
multi-factor factorial, a more direct approach is possible. One may arrange the
estimates, $t_{ij}$, in a pq way table, (pq-1) way tables, etc. The main effect sums
of squares and all interaction sums of squares are obtained considering the
$t_{ij}$’s as single observations. Then to obtain adjusted sums of squares, all sums
of squares for $A_i$-factors and all interactions containing only $A_1, \ldots, A_p$-factors
are multiplied by the coefficient qk(n-1)/n. All sums of squares of
$C_j$-factors and all interactions containing only $C_1, \ldots, C_q$-factors are
multiplied by the coefficient q(k-1). All interaction sums of squares involving both
A and C factors are multiplied by the coefficient of (2.4.10), q(nk-n-k)/n.
The resulting sums of squares are assigned degrees of freedom in the usual
manner and tests of significance made with an F-test, using the intra-block error
mean square.


2.5 ANALYSIS OF A RECTANGULAR LATTICE USING RECOVERY OF INTER-BLOCK 
INFORMATION 

To obtain estimates of $\tau_{ij}$ using the recovery of inter-block information we
assume the model in Section 2.2 with the $\beta_{sgf}$’s assumed to be independent normal
variates with a mean of zero and equal variances, $\sigma_\beta^2$. The normal equations for
q repetitions of the basic design yield

    $w Q_{ij} + w' Q'_{ij} = q[(nk - n - k)w + kw'] t'_{ij}/n$
                          $+ q(w - w')(t'_{i.} + t'_{.j})/n$,    (2.5.1)

where

    $w = 1/\sigma^2$,    (2.5.2)
    $w' = 1/(\sigma^2 + n\sigma_\beta^2)$,    (2.5.3)
and
    $Q'_{ij} = B_{ij.}/n - (k - 1)m$,    (2.5.4)
where m is the least squares estimate of $\mu$. Solving (2.5.1) for $t'_{ij}$, the estimate of
$\tau_{ij}$ using recovery of inter-block information, we obtain
    $t'_{ij} = C'_0 P_{ij} + C'_1 \sum_i P_{ij} + C'_2 \sum_j P_{ij}$,    (2.5.5)
where
    $P_{ij} = w Q_{ij} + w' Q'_{ij}$,    (2.5.6)
    $C'_0 = n/q[(nk - n - k)w + kw']$,
    $C'_1 = (w' - w)/q(k - 1)w[(nk - n - k)w + kw']$,


    $C'_2 = n(w' - w)/q[(nk - n - k)w + kw'][k(n - 1)w + (k - n)w']$.
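
Two properties of these constants are easy to confirm numerically: they invert the normal equations (2.5.1), and they reduce to the intra-block constants of Section 2.2 when w = 1 and w' = 0. The sketch below checks both with exact rational arithmetic. The design sizes, the weights and the estimates are hypothetical, `wp` stands for w', and C'_1 is taken to multiply the sum of P_ij over i and C'_2 the sum over j.

```python
from fractions import Fraction as F

k, n, q = 4, 3, 2                       # hypothetical design sizes
base = n * k - n - k


def constants(w, wp):
    """Return (C0', C1', C2') for weights w and w' (passed as wp)."""
    w, wp = F(w), F(wp)
    d0 = base * w + k * wp              # (nk-n-k)w + kw'
    c0 = n / (q * d0)
    c1 = (wp - w) / (q * (k - 1) * w * d0)
    c2 = n * (wp - w) / (q * d0 * (k * (n - 1) * w + (k - n) * wp))
    return c0, c1, c2


# Hypothetical weights and zero-sum estimates t'_ij.
w, wp = F(1, 2), F(1, 5)
t = [[F(v) for v in row] for row in
     [[3, -1, 2], [0, 4, -5], [-2, 1, 1], [-3, 2, -2]]]
t_row = [sum(t[i]) for i in range(k)]
t_col = [sum(t[i][j] for i in range(k)) for j in range(n)]

# Right-hand side of (2.5.1), which equals P_ij = w Q_ij + w' Q'_ij.
d0 = base * w + k * wp
P = [[q * d0 * t[i][j] / n + q * (w - wp) * (t_row[i] + t_col[j]) / n
      for j in range(n)] for i in range(k)]
P_row = [sum(P[i]) for i in range(k)]                            # sum over j
P_col = [sum(P[i][j] for i in range(k)) for j in range(n)]       # sum over i

c0, c1, c2 = constants(w, wp)
t_back = [[c0 * P[i][j] + c1 * P_col[j] + c2 * P_row[i]
           for j in range(n)] for i in range(k)]

# Intra-block constants of Section 2.2, for comparison at w = 1, w' = 0.
intra = (F(n, q * base),
         F(-1, q * (k - 1) * base),
         F(-n, q * k * (n - 1) * base))
```

Here `t_back` equals `t`, and `constants(1, 0)` equals `intra`, which is the sense in which the intra-block analysis is the special case in which no inter-block information is used.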


The estimation of w and w’ is made by first setting up the analysis of variance
table as given in Table I; however, the computation of the adjusted treatment sum
of squares is more easily obtained by


    Adj. Treat. S.S. $= C_0 \sum_i \sum_j Q_{ij}^2 + C_1 \sum_j (\sum_i Q_{ij})^2 + C_2 \sum_i (\sum_j Q_{ij})^2$,


instead of using (2.2.5). Table II is then set up to obtain blocks within replicates
adjusted by subtraction. Then from Table II we obtain estimates of w and w’ as


    $w = 1/E$  and  $w' = [(k - 1)q - 1]/[(k - 1)qB - E]$.    (2.5.7)


Table II summarizes the inter-block analysis. 

With the assumption of w and w’ being known without error, Rao [15]
obtained a test of the hypothesis, $H_0: \tau_{11} = \cdots = \tau_{kn}$, for the estimates of $\tau_{ij}$
using recovery of inter-block information based on the statistic


    $\chi_T^2 = \sum_i \sum_j t'_{ij}(w Q_{ij} + w' Q'_{ij})$    (2.5.8)


where $\chi_T^2$ has (nk-1) degrees of freedom. This test is an approximate test if w
and w’ are not known without error, as the statistic is only asymptotically
distributed as a Chi-square variable. Hence, the estimation of w and w’ should be
based on a large number of degrees of freedom.
TABLE II. INTER-BLOCK ANALYSIS FOR RECTANGULAR LATTICE


Source of Variation                   Degrees of Freedom     Sum of Squares                                     Mean Square

Replications                          q(k-1)-1               From Table I
Blocks within replicates (adjusted)   q(k-1)²                By subtraction                                     B
Treatments                            (nk-1)                 $\sum_i \sum_j T_{ij}^2/q(k-1) - G^2/(k-1)kqn$
Intra-block error                     kq(k-1)(n-1)-nk+1      From Table I                                       E

Total                                 qnk(k-1)-1             From Table I








2.6 ANALYSIS OF FACTORIALS USING RECOVERY OF INTER-BLOCK INFORMATION 


To facilitate the introduction of factorials we write (2.5.8) as


    $\chi_T^2 = \{ q[(nk - n - k)w + kw'] \sum_i \sum_j t'^2_{ij}$
               $+ q(w - w')(\sum_i t'^2_{i.} + \sum_j t'^2_{.j}) \}/n$,    (2.6.1)


by using (2.5.1). As before, we first consider a basic two factor factorial and
write (2.3.1). It then follows that


    $t'_{ij} = a'_i + c'_j + d'_{ij}$    (2.6.2)


where $t'_{ij}$, $a'_i$, $c'_j$, and $d'_{ij}$ are the least squares estimates using recovery of
inter-block information of $\tau_{ij}$, $\alpha_i$, $\gamma_j$, and $\delta_{ij}$, respectively. Substituting (2.6.2) in
(2.6.1) yields


    $\chi_T^2 = q[k(n - 1)w + (k - n)w'] \sum_i a'^2_i + qk(k - 1)w \sum_j c'^2_j$

               $+ q[(nk - n - k)w + kw'] \sum_i \sum_j d'^2_{ij}/n$

             $= [q(w' - w)C'_0/C'_2] \sum_i a'^2_i + qk(k - 1)w \sum_j c'^2_j + (1/C'_0) \sum_i \sum_j d'^2_{ij}$.


Cochran’s theorem [2] enables us to write


    $\chi_A^2 = q[k(n - 1)w + (k - n)w'] \sum_i a'^2_i = [q(w' - w)C'_0/C'_2] \sum_i a'^2_i$

             $= [qn(k - 1)wC'_1/C'_2] \sum_i a'^2_i$,


    $\chi_C^2 = qk(k - 1)w \sum_j c'^2_j$,


    $\chi_{AC}^2 = q[(nk - n - k)w + kw'] \sum_i \sum_j d'^2_{ij}/n = (1/C'_0) \sum_i \sum_j d'^2_{ij}$,


with (k-1), (n-1), and (n-1)(k-1) degrees of freedom, respectively, where


    $a'_i = (1/n) \sum_j t'_{ij} = \bar{t}'_{i.}$,    (2.6.6)


    $c'_j = (1/k) \sum_i t'_{ij} = \bar{t}'_{.j}$,    (2.6.7)

and

    $d'_{ij} = t'_{ij} - \bar{t}'_{i.} - \bar{t}'_{.j}$.    (2.6.8)
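2.7 INDIVIDUAL COMPARISONS AND MULTI-FACTOR FACTORIALS USING RECOVERY
OF INTER-BLOCK INFORMATION


Individual comparisons using recovery of inter-block information are possible
in much the same way as indicated in Section 2.4.

The contrasts in (2.4.3), (2.4.6), and (2.4.9) are altered by replacing $\bar{t}_{i.}$,
$\bar{t}_{.j}$, and $t_{ij}$ by $\bar{t}'_{i.}$, $\bar{t}'_{.j}$, and $t'_{ij}$, respectively. Then to test the hypotheses


    $\sum_i \xi_{iu} \alpha_i = 0$,  $\sum_j \eta_{jv} \gamma_j = 0$,  and  $\sum_i \sum_j \xi_{iu} \eta_{jv} \delta_{ij} = 0$,


the following are used.


    $\chi_{I_u}^2 = [q(w' - w)C'_0/C'_2](\sum_i \xi_{iu} \bar{t}'_{i.})^2 / \sum_i \xi_{iu}^2$


                 $= [q(w' - w)C'_0/n^2 C'_2](\sum_i \sum_j \xi_{iu} t'_{ij})^2 / \sum_i \xi_{iu}^2$,


    $\chi_{J_v}^2 = qk(k - 1)w(\sum_j \eta_{jv} \bar{t}'_{.j})^2 / \sum_j \eta_{jv}^2$


                 $= q(k - 1)w(\sum_i \sum_j \eta_{jv} t'_{ij})^2 / k \sum_j \eta_{jv}^2$,


    $\chi_{(IJ)_{uv}}^2 = (1/C'_0)(\sum_i \sum_j \xi_{iu} \eta_{jv} t'_{ij})^2 / \sum_i \sum_j (\xi_{iu} \eta_{jv})^2$.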

Multi-factor factorials using recovery of inter-block information may be used 
as defined in Section 2.4. 
2.8 VARIANCES AND COVARIANCES


The variances and covariances of the estimators utilizing recovery of
inter-block information may be obtained by considering equation (2.5.5) which
may be written as


    $t'_{ij} = (C'_0 + C'_1 + C'_2)P_{ij} + C'_1 \sum_{i' \neq i} P_{i'j} + C'_2 \sum_{j' \neq j} P_{ij'}$    (2.8.1)


where $C'_0$, $C'_1$, $C'_2$ and $P_{ij}$ are defined in equations (2.5.6). Then
    $Var(t'_{ij}) = C'_0 + C'_1 + C'_2 - A$,    (2.8.2)
    $Cov(t'_{ij}, t'_{i'j}) = C'_1 - A$,    $i \neq i'$    (2.8.3)
    $Cov(t'_{ij}, t'_{ij'}) = C'_2 - A$,    $j \neq j'$    (2.8.4)
and
    $Cov(t'_{ij}, t'_{i'j'}) = -A$,    $i \neq i', j \neq j'$    (2.8.5)
where 
    $A = (C'_0 + kC'_1 + nC'_2)/nk$.    (2.8.6)
The variances and covariances of the $a'_i$’s, $c'_j$’s, and $d'_{ij}$’s are obtained using
equations (2.6.6), (2.6.7), (2.6.8), (2.8.2), (2.8.3), (2.8.4), and (2.8.5), as
    $Var(a'_i) = [(k - 1)C'_0 + n(k - 1)C'_2]/nk$,    (2.8.7)
    $Var(c'_j) = [(n - 1)C'_0 + k(n - 1)C'_1]/nk$,    (2.8.8)
    $Cov(a'_i, a'_{i'}) = -(C'_0 + nC'_2)/nk$,    (2.8.9)
    $Cov(c'_j, c'_{j'}) = -(C'_0 + kC'_1)/nk$,    (2.8.10)
    $Cov(a'_i, c'_j) = 0$,    (2.8.11)
    $Var(a'_i - a'_{i'}) = 2(C'_0 + nC'_2)/n$,    (2.8.12)
    $Var(c'_j - c'_{j'}) = 2(C'_0 + kC'_1)/k$,    (2.8.13)
    $Var(d'_{ij}) = (nk - n - k + 1)C'_0/nk$,    (2.8.14)
    $Cov(d'_{ij}, d'_{ij'}) = -(k - 1)C'_0/nk$,    (2.8.15)
    $Cov(d'_{ij}, d'_{i'j}) = -(n - 1)C'_0/nk$,    (2.8.16)
    $Cov(d'_{ij}, d'_{i'j'}) = [C'_0 - (2nk - n - k)A]/nk$,    (2.8.17)
    $Cov(a'_i, d'_{ij}) = 0$,    (2.8.18)
    $Cov(c'_j, d'_{ij}) = 0$.    (2.8.19)


If the variances and covariances are desired for estimates based on the
intra-block analysis only, set w = 1 and w’ = 0 in the above equations and multiply
the results by $\sigma^2$.
A QUARTERLY ECONOMETRIC MODEL OF THE UNITED STATES


LOWELL E. GALLAWAY AND PAUL E. SMITH
San Fernando Valley State College 


Simplicity in an econometric model may often be a virtue provided 
that the model performs its particular predictive function in a reason- 
ably competent manner. The following model, while minimizing com- 
putational difficulties, demonstrates fairly accurate forecasting ability 
and a partial explanation, via an accelerator, of postwar fluctuations 
in the gross national product of the United States. 


AN ECONOMETRIC model may be constructed either for the purpose of verifying
a hypothesis about economic theory by statistical techniques or for
making forecasts about future events, based on the known values of the
system’s exogenous and predetermined variables and the calculated parameters
of the model. The following simple model of the United States economy, which
utilizes only predetermined or lagged independent variables, is based on
quarterly data and mainly falls into the second category.¹ No attempt is made
at any profound theoretical or statistical refinements, although the relationships
of the model do not seriously conflict with accepted economic theory.

The variables to be included in the model are as follows: 


$Y$ = gross national product
$Y_d$ = disposable income
$I$ = gross private domestic investment
$C$ = personal consumption expenditures
$R$ = property income before taxes²
$G$ = government expenditures on goods and services plus net foreign
investment
$M$ = demand deposits, currency, and time deposits in the hands of the public
at the beginning of the quarter
$u$ = an error term





1 Due to the time lag involved in collecting data the model consists largely of an attempt to predict what is 
currently taking place, the prediction being based on the values of predetermined variables, which are themselves 
unrevised and subject to considerable error. 

2 The property income data were derived by dividing national income into labor income W and property
income R. Labor income is defined as compensation of employees A plus the portion of proprietor’s income U that
may be ascribed to labor. Property income is defined as rental income of persons X, plus corporate profits and
inventory adjustment P, plus net interest Z, plus the portion of U that may be ascribed to property. The basic
problem is deciding how to apportion proprietor’s income between property and labor. This was done on the basis
of the assumption that proprietor’s income would be divided between property and labor the same as
non-proprietor’s income. For example, property income for any one period is indicated by the following equation:


    $R = X + P + Z + kU$


where

    $k = (X + P + Z)/(X + P + Z + A)$.


An alternative method of estimating the property component of income of unincorporated enterprises would
be to impute a wage equal to that of a typical worker in the industry involved to each proprietor and subtract the
sum of these imputations from the total income. This was rejected by the authors on the grounds that the labor
of the proprietor and the labor of a typical worker are not necessarily homogeneous.





The time period to which the value of any particular variable applies shall be
denoted by the subscripts t, t-1, etc.

The gross national product identity is given by the familiar expenditures
equation:


    $Y_t = C_t + I_t + G_t$.    (1)


Current consumption expenditures are determined by lagged disposable
income and the money supply at the beginning of the quarter, i.e.,


    $C_t = a_0 + a_1 Y_{d,t-1} + a_2 M_t + u_t$,    (2)


and gross private domestic investment is related to a gross national product
accelerator and property income, both lagged one period. Thus


    $I_t = b_0 + b_1(Y_{t-1} - Y_{t-2}) + b_2 R_{t-1} + u_t$.    (3)


Finally, government expenditures and net foreign investment are assumed
to grow at a constant percentage rate in the form:


    $G_t = g_0 + g_1 G_{t-1} + u_t$.    (4)


Those econometric models which incorporate annual data, e.g., Klein- 
Goldberger, inevitably treat government expenditures as an exogenous vari- 
able. The predictions for the Klein-Goldberger model, for example, are made 
in the fall of the year for which the forecast is being made, and the estimation 
of government expenditures for the year is based upon available data for the 
first two quarters and an intuitively reasonable random guess for the remainder 
of the year.* Equation (4) is not boldly meant to imply that a causal relationship 
necessarily exists between current and lagged government expenditures, al- 
though some argument can be made that such is partly the case, but rather to 
provide an unambiguous and strictly objective basis for predicting the level of 
activity in the government sector.* 
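
Because every explanatory variable in equations (2)-(4) is lagged or predetermined, a one-quarter-ahead forecast follows by direct substitution into the identity (1). The sketch below illustrates that mechanics only. The function name `forecast_quarter`, the coefficient values and the input figures are hypothetical, the error terms are set to zero, and the equations are used in level form rather than in the first-difference form in which the authors estimated them.

```python
def forecast_quarter(coef, yd_lag, m_now, y_lag1, y_lag2, r_lag, g_lag):
    """One-quarter-ahead forecast from equations (1)-(4), errors set to zero.

    coef    -- dict of hypothetical coefficients a0, a1, a2, b0, b1, b2, g0, g1
    yd_lag  -- disposable income in the previous quarter
    m_now   -- money supply at the beginning of the current quarter
    y_lag1, y_lag2 -- gross national product one and two quarters back
    r_lag   -- property income in the previous quarter
    g_lag   -- government expenditures plus net foreign investment, lagged
    """
    consumption = coef["a0"] + coef["a1"] * yd_lag + coef["a2"] * m_now   # (2)
    investment = (coef["b0"] + coef["b1"] * (y_lag1 - y_lag2)
                  + coef["b2"] * r_lag)                                    # (3)
    government = coef["g0"] + coef["g1"] * g_lag                           # (4)
    gnp = consumption + investment + government                            # (1)
    return {"C": consumption, "I": investment, "G": government, "Y": gnp}


# Hypothetical coefficients and inputs, chosen only to make the arithmetic easy.
coef = {"a0": 1.0, "a1": 0.5, "a2": 0.25,
        "b0": 2.0, "b1": 0.5, "b2": 0.25,
        "g0": 0.0, "g1": 1.5}
out = forecast_quarter(coef, yd_lag=100.0, m_now=40.0,
                       y_lag1=130.0, y_lag2=120.0, r_lag=20.0, g_lag=10.0)
```

With these illustrative numbers the components are C = 61, I = 12 and G = 15, so the forecast of Y is 88. To forecast further ahead, disposable income, property income and the money supply would also have to be projected, which the model itself does not do.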

Utilizing seasonally adjusted data in current prices for the time period 
from 1948 through 1957, the structural equations, all of which clearly are 
identified, were estimated by the simple least squares method.⁵ Moreover, in 
order to reduce the extent of autocorrelation and multicollinearity, the equa- 
tions were estimated in terms of first differences.⁶ The results were 





3 Klein, L. R. and Goldberger, A. S., An Econometric Model of the United States: 1929-1952 (Amsterdam: 
North Holland Publishing Company, 1957). 

4 Such objectivity is needed inasmuch as the forecasts of this model, as well as those of Klein-Goldberger, are 
extremely sensitive to government expenditures. Moreover, in the budgetary process, expenditures in the final 
quarters of any fiscal year are clearly somewhat dependent upon expenditures in the earlier quarters, given total 
planned outlays for the year, and such projects as are continued over several time periods would suggest the exist- 
ence of a relationship between expenditures over several time periods. 

5 Justification for using seasonally adjusted data is that some of the time series display markedly different 
patterns of seasonal variation. Moreover, the authors were in agreement that, inasmuch as the model is designed 
primarily as a forecasting tool, upturns and downturns should be predicted without their being clouded over by 
normal seasonal changes. 

An alternative method of handling the problem of seasonal variation is presented in Suits, D. B., “Use of 
Dummy Variables in Regression Equations,” Journal of the American Statistical Association, 52, (1957), pp. 548-51. 

6 Although collinearity between the independent variables does not harm the model’s effectiveness as a fore- 
casting tool, it does lead to biased estimates of the regression coefficients and large standard errors of estimate. 
For example, if Equation (2a) is solved in terms of absolute values, the regression coefficient for the money supply 
becomes negative and is smaller than its standard error, a result which is difficult to justify on theoretical grounds. 








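\Delta C_t = 0.09 + 0.43 \Delta Y_{d,t-1} + 0.23 \Delta M_t,    R^2 = [value illegible] 
                    (0.15)                   (0.13) 

\Delta I_t = 0.08 + 0.43 \Delta(Y_{t-1} - Y_{t-2}) + 0.48 \Delta R_{t-1},    R^2 = [value illegible] 
                    (0.13)                            (0.22) 

\Delta G_t = 0.13 + 0.67 \Delta G_{t-1}. 
                    (0.12) 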


Changes in gross national product are hence given by 


\Delta Y_t = 0.30 + 0.43 \Delta Y_{d,t-1} + 0.23 \Delta M_t + 0.43 \Delta(Y_{t-1} - Y_{t-2}) + 0.48 \Delta R_{t-1} 
             + 0.67 \Delta G_{t-1}. 
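
Because gross national product is the sum of its three components, the aggregate equation follows from adding the three estimated equations term by term. The following is a minimal sketch of that bookkeeping; the dictionary keys are illustrative labels chosen here, not notation from the paper.

```python
# Coefficients of the three estimated first-difference equations, as printed above.
# The key names are illustrative labels only.
dC = {"const": 0.09, "dYd_lag1": 0.43, "dM": 0.23}
dI = {"const": 0.08, "d_accelerator_lag1": 0.43, "dR_lag1": 0.48}
dG = {"const": 0.13, "dG_lag1": 0.67}


def combine(*equations):
    """Add linear equations term by term (dY = dC + dI + dG)."""
    total = {}
    for equation in equations:
        for name, coef in equation.items():
            # Round to two decimals, the precision of the printed coefficients.
            total[name] = round(total.get(name, 0.0) + coef, 2)
    return total


dY = combine(dC, dI, dG)
print(dY["const"])  # the three intercepts add up to 0.3
```

Only the intercepts share a term, so every slope coefficient carries over unchanged into the aggregate equation.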


The values of C, I, and G were estimated by adding the first differences 
calculated from the regression equations to the previous quarter’s C, I, and G. 
In turn these were summed to arrive at an estimate of gross national product, 
the results being shown in Table 1. The residual or unexplained variance, ex- 
pressed as a percentage of the total variance of gross national product, was 0.6 
per cent.⁷ 
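
The step from predicted first differences back to levels can be sketched as follows. The component values used here are hypothetical placeholders, not figures from the paper, and the function name is introduced only for illustration.

```python
def predict_levels(prev_actual, predicted_change):
    """One-quarter-ahead levels: previous quarter's level plus the predicted change.

    Both arguments are dicts keyed by "C", "I" and "G"; the estimate of
    gross national product Y is the sum of the three predicted components.
    """
    levels = {k: round(prev_actual[k] + predicted_change[k], 1) for k in ("C", "I", "G")}
    levels["Y"] = round(levels["C"] + levels["I"] + levels["G"], 1)
    return levels


# Hypothetical numbers, in billions of dollars, chosen only to exercise the function.
previous_quarter = {"C": 70.0, "I": 16.0, "G": 22.0}
predicted_change = {"C": 0.8, "I": -0.3, "G": 0.5}
forecast = predict_levels(previous_quarter, predicted_change)
print(forecast)
```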

The Durbin-Watson test for serial correlation, i.e., whether successive values 
of the unexplained residuals are correlated, was applied.⁸ The computed values 
of d and 4 - d were 2.44 and 1.56, respectively. Since neither is less than the 
lower limit but one lies between the two limits at the one and five per cent 
probability levels, the test was inconclusive. 
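
For readers who wish to reproduce this kind of check, a minimal sketch of the Durbin-Watson statistic and the bounds decision rule follows. The bounds passed to `classify` are hypothetical stand-ins; the tabulated limits actually used by the authors are not reported in the text.

```python
def durbin_watson(residuals):
    """d = sum of squared successive differences / sum of squared residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def classify(d, lower, upper):
    """Bounds test applied to both d (positive) and 4 - d (negative correlation)."""
    smaller = min(d, 4 - d)
    if smaller < lower:
        return "serial correlation indicated"
    if smaller > upper:
        return "no serial correlation indicated"
    return "inconclusive"


d = 2.44                       # value reported in the text
print(round(4 - d, 2))         # 1.56, as reported
# Hypothetical bounds, chosen only so that 1.56 falls between them.
print(classify(d, lower=1.4, upper=1.6))
```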

Insofar as the properly timed and successful implementation of government 
policy is concerned, perhaps the most important tests of any forecasting model 
are whether it correctly predicts the direction of change and whether it is 
accurate in forecasting turning points in the value of the variable which is 
being considered.⁹ In order to have a basis for comparison, we shall contrast 
the model with a naive forecasting model which assumes that the direction of 
change is always positive, i.e., gross national product steadily increases over 
time. 

The naive model is incorrect with respect to the direction of change in nine 
out of the thirty-nine cases, whereas the econometric model is wrong only five times. Of 
necessity, moreover, the naive model is unable to detect turning points in the 
actual data, but the econometric model correctly predicts two of the three downturns on 
schedule and errs on the other downturn and the two upturns only in that it 
forecasts them a quarter after they actually occur. 





7 It must be pointed out that this unusually low result is not due entirely to the “goodness” of the regression equa- 
tions. The error terms for each of the three structural equations tend to offset each other so that the percentage of 
explained variance for gross national product is higher than that for the equations explaining C, I, and G. 

8 Durbin, J. and Watson, G. S., “Testing for Serial Correlation in Least Squares Regression,” Biometrika, 38 
(1951), pp. 159-77. 

9 For more on this, see Theil, H., Economic Forecasts and Policy, (Amsterdam, 1958), Chapters 2 and 4. 
TABLE 1. ACTUAL AND ESTIMATED GROSS NATIONAL PRODUCT, 
SEASONALLY ADJUSTED BY QUARTER IN 
CURRENT PRICES, 1948-57 

(in billions) 

Columns: Year and Quarter; Actual; Predicted. [Table entries not legible in this copy.] 

Source: Survey of Current Business. 


TABLE 2. ACTUAL AND PREDICTED GROSS NATIONAL PRODUCT, 
SEASONALLY ADJUSTED BY QUARTER IN 
CURRENT PRICES, 1958 

(in billions) 

Year and       Actual     Predicted 
Quarter          Y            Y 

1958-1         107.8        109.4 
     2         108.6        106.7 
     3         111.0        111.5 
     4         114.3        114.2 

Total          441.7        441.8 
Using data which are included in the computations of the regression equations 
as a means of testing the model is not a strictly accurate measure of the model’s 
reliability. Rather, it is preferable to test the model’s predictive merits for 
time periods outside of those which are included in the model. For this purpose, 
the model was applied so as to forecast gross national product for the four 
quarters of 1958, the results being shown in Table 2. It should be noted that 
the model correctly predicts the direction of change in three quarters. 
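
The figures in Table 2 can be checked with a short sketch. It assumes, as an interpretation not spelled out in the text, that the predicted direction of change is measured from the previous quarter's actual level. The first quarter of 1958 cannot be scored this way because the actual value for 1957-4 is not given in Table 2.

```python
# Table 2: actual and predicted GNP for the four quarters of 1958 (billions).
actual = [107.8, 108.6, 111.0, 114.3]
predicted = [109.4, 106.7, 111.5, 114.2]

# Forecast errors and annual totals.
errors = [round(p - a, 1) for a, p in zip(actual, predicted)]
totals = (round(sum(actual), 1), round(sum(predicted), 1))


def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


# Direction-of-change check for quarters 2 to 4 (assumed definition, see lead-in).
hits = []
for q in range(1, len(actual)):
    predicted_change = predicted[q] - actual[q - 1]
    actual_change = actual[q] - actual[q - 1]
    hits.append(sign(predicted_change) == sign(actual_change))

print(errors, totals, hits)
```

Under this definition the second quarter is the one miss among the three quarters that can be scored, which is consistent with three correct directions out of four if the first quarter was also called correctly.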

Although the model presented here is admittedly extremely simplified and 
overly aggregated, it does satisfy an important test of econometric models in 
that it does a reasonably successful job of prediction. Moreover, the discovery 
of a gross private investment accelerator in a quarterly model helps to provide 
a partial explanation for the three to four year oscillation observable in United 
States gross national product data since World War II.¹⁰ 





10 An examination of the model reveals that its properties include a convergent oscillation with a duration 
of about eleven or twelve time periods. 
of Modern Physics, and Journal of the American Statistical Association. 

MANKAL NARASIMHA MURTHY, 29, is Head of the Design Section in the De- 
partment of National Sample Survey of the Indian Statistical Institute. Murthy received 
his M.A. degree in Statistics from the University of Madras in 1954 and received further 
training in professional statistics for one year at the Indian Statistical Institute. In 1960 
he spent seven months in the United States studying developments in the field of data 
collection and compilation. Articles by Murthy on sample surveys and quality control 
are published in Sankhyā and in the Journal of Indian Society for Quality Control. 

JEAN LIBERTY PENNOCK, 51, has worked in the United States Department of 
Agriculture since 1942 and is currently a leader of family living investigations in the 
Household Economics Research Division. Educated in history at Connecticut College 
for Women (B.A., 1933; M.A., 1937), Miss Pennock’s major interest is now in consumption 
economics. She has written a number of bulletins and articles on family living expendi- 
tures, including a previous article with Carol Jaeger that was published in the June, 1957 
issue of JASA. 

DOUGLAS SHERMAN ROBSON, 36, Associate Professor of Biological Statistics 
at Cornell University since 1955, majored in statistics at Iowa State University and 
Cornell (Ph.D., 1955). His earlier articles have appeared in various statistical and bio- 
metric journals including Annals of Mathematical Statistics, Genetics, and Biometrics. 
Robson initially became interested in ratio estimation while analyzing corn uniformity 
data in 1950. The present paper with Miss Vithayasai contains a simpler expression for 
the variance of a ratio estimator than he developed in the December, 1957 JASA article, 
“Applications of Multivariate Polykays to the Theory of Unbiased Ratio-Type Es- 
timation.” 

EUGENE ROGOT, 34, is Analytical Statistician in the Biometrics Branch of the 
National Institute of Neurological Diseases and Blindness, National Institutes of Health. 
Before taking this position in 1958 he was Biostatistician (1953-6) and Senior Biostatistician 
(1956-7) in the New York State Mental Health Research Unit, Syracuse. He holds a 
B.S. in Psychology from City College of New York and an M.A. in Mathematics from 
Syracuse University. Rogot’s earlier articles have appeared in medical journals. 

YECHEZKEL HENRY RUTENBERG, 32, is a Staff Assistant, Headquarters 
Manufacturing, Westinghouse Electric Corporation. From 1953 to 1955 he was a Mechani- 
cal Engineer with “Nechushtan Elevators,” Tel Aviv, Israel, and from 1955 to 1956 an 
Industrial Engineer with the Industrial Service Corporation, Haifa, Israel. Rutenberg 
received his B.S. (1953) in Mechanical Engineering at Israel Institute of Technology and 
his M.S. (1958) in Operations Research, Case Institute of Technology. During his graduate 
studies he was a Graduate Assistant, first with the Computing Center and Statistical Lab- 
oratory and later with the Operations Research Group, at Case Institute of Technology 
(1957-61). 

VINOD KARAN SETHI, 30, worked as a technician in the Design Unit of the 
Indian Statistical Institute’s National Sample Survey from 1955-8. In 1958 he became 
Assistant Professor in the Institute of Social Sciences, Agra University, a position he still 
holds. His initial training in Physics and Mathematics at Agra (B.S., 1951; M.S., 1953) 
was followed by two years of statistical training at the Indian Statistical Institute. Sethi’s 
principal interests are in sample surveys and estimation. A previous article, “Some 
Sampling Systems Providing Unbiased Ratio Estimators,” with N. S. Nanjamma and 
M. N. Murthy, appeared in Sankhyā, vol. 21, pp. 299-316. 

PAUL EDWARD SMITH, 34, serves as Assistant Professor of Economics at San 
Fernando Valley State College, California, while working on his Ph.D. thesis in Economics 
for submission to the University of Michigan. He previously studied at San Diego State 
College and the University of Virginia. He is co-author (with Lowell E. Gallaway) of an 
article, “Real Balances and the Permanent Income Hypothesis,” to appear in the Quar- 
terly Journal of Economics. 

ROBERT FLEMMING TATE, 39, wrote an article, “Optimal Confidence Intervals 
for the Variance of a Normal Distribution,” with G. W. Klett, for the September, 1959 
issue of the Journal. His biographical note appears on p. 698 of that issue. 

GERHARD TINTNER, 53, is Professor of Economics, Mathematics, and Statistics 
at Iowa State University. Educated in economics, statistics, and law at the University 
of Vienna, he later visited Harvard, Columbia, California, Stanford, Cambridge, and 
Institut Henri Poincaré in Paris on a post-doctoral Rockefeller Fellowship. He held 
positions at the Austrian Institute of Trade Cycle Research and the Cowles Commission 
for Research in Economics before going to Iowa State in 1937. 

Tintner has lectured widely in Europe on several occasions and worked in the De- 
partment of Applied Economics at Cambridge in 1948-9. He has served as a consultant or 
associate of the United States Department of Agriculture, the Office of European Eco- 
nomic Research, and the Office of Strategic Services. He is a Fellow of the Econometric 
Society, the American Statistical Association, and the Institute of Mathematical Statistics 
and serves on the editorial boards of Econometrica and Metro-Economica. Tintner is the 
author of Prices in the Trade Cycle, Vienna: Springer, 1935; The Variate Difference Method, 
Bloomington, Indiana: Principia Press, 1940; Econometrics, New York: Wiley, 1952, 
1955; and Mathematics and Statistics for Economists, New York: Rinehart, 1953, 1954; 
as well as more than sixty articles in professional journals. His last previous article in 
JASA was “Some Applications of Multivariate Analysis of Economic Data” which ap- 
peared in the September, 1946 issue. 

CHESTER WILLIAM TOPP, 44, received his B.A. at Huron College and his M.A. 
at the University of Illinois, both in Mathematics, before taking his Ph.D. in Mathe- 
matical Statistics at the Case Institute of Technology in 1951. He has held his present 
position as Professor of Mathematics at Fenn College since 1941. Topp is co-author 
(with Fred Leone) of an article, “A Family of J-Shaped Frequency Functions,” which 
appeared in the March, 1955 issue of the Journal. His other articles have appeared in 
Industrial Quality Control and American Mathematical Monthly. 

CHITRA VITHAYASAI, 28, majored in mathematics at Chulalongkorn University, 
Thailand before undertaking graduate work in Statistics at Cornell. Her present article 
with D. S. Robson is part of a Master’s Thesis prepared under Robson’s supervision. 

COLERIDGE A. WILKINS, 30, has been Lecturer in Mathematics at the Univer- 
sity of New South Wales since 1959. He formerly taught at the University of Auckland. 
Wilkins holds the degree of Master of Science in Mathematics from the University of New 
Zealand. His main field is Topology but he also has strong interests in Statistics. His 1957- 
8 academic year was spent at the University of Notre Dame on a Fulbright Scholarship. 
Opportunities for statisticians with the U. S. Government in the international field are primarily in conjunction 
with the technical assistance programs in the underdeveloped countries. The positions are mainly of two types: 
(1) statistical advisors to recipient governments to assist in establishing and/or improving statistical systems in 
particular fields, or (2) statistical analysts on the U. S. Operations Mission (International Cooperation Administra- 
tion) staffs involving implementation of the aid program. 





Under the U. S. integrated programs, the statisticians function as members of teams. Usually the statistical 
work is part of a broader undertaking such as economic planning, establishment of central banking, overall improve- 
ment of government operation, or a substantive program such as labor statistics, population and vital statistics, etc. 

Statisticians must have broad academic background and operational experience of responsibility, plus in- 
genuity for adaptation of know-how. Assignments are either on regular foreign service two-year tours or for short 
terms of a few weeks or months. Recruitment is usually through U. S. statistical agencies such as the Bureau of 
Census, Bureau of Labor Statistics, or the National Office of Vital Statistics. 


Recent Developments in Federal Criminal Statistics. Dana M. Barbour, Office of Statistical Standards, Bureau 
of the Budget. 


Several improvements have been made during the last few years in the criminal statistics (police, judicial 
and correctional) issued by the Federal government. In the field of police statistics recommendations of the Con- 
sultant Committee on Uniform Crime Reporting have been, or are being, adopted by the F.B.I. These include 
expansion of coverage, particularly for rural areas, use of more current population figures, and a revision in the 
grouping of offenses known to police in an effort to better measure serious crime. The publication of an “index” of 
crime in the United States, or even an “index” for cities over 25,000, is, however, of questionable validity. 

In the fields of judicial and correctional statistics the Children’s Bureau began in 1956 to publish data based 
on its new national sample of about 500 juvenile courts, and the Bureau of Prisons is making progress towards 
getting its detailed national prisoner statistics program on a current basis. Proposals for other improvements are 
under consideration, including plans for better integrated statistics on Federal crimes. 

Nevertheless, there are still major gaps in our statistical intelligence about crime and criminals. Filling some 
of these gaps may depend on further progress by the state statistical agencies. 


A Study of Validity in Reporting Medical Care in Michigan. Robin Barlow, James N. Morgan and Grover C. 
Wirick, Jr., University of Michigan. 


This paper reports three validity checks made of the use of medical services in Michigan as reported in a 
survey conducted jointly by The University of Michigan Study of Hospital and Medical Economics and The Survey 
Research Center. The Study is the result of The Governor’s Study Commission on Pre-Paid Hospital and Medical 
Care Plans, and is financed by the Kellogg Foundation. 

Screening questions were used to increase precision by augmenting the sample for families with aged members 
and families with high medical expenses (someone hospitalized during the year). This was combined with a normal 
cross-section sample, by appropriate weighting, and bias was largely removed from the estimates by this procedure. 

Respondents were asked only for insurance policy identification, and the insurance carriers were asked for 
details of the coverage. 

Type or specialty of medical practitioners visited was not asked, but was noted when volunteered (about 
three-fifths of cases). Names and addresses of practitioners were used to verify types and specialties through medical 
directories. 

Several aspects of reported hospital stays were checked in one of two ways. Blue Cross records were searched 
for all persons reporting such coverage, and for others, hospital stays were verified directly with the hospital named. 

For each check, verified data were compared with reported data by various control variables, and conclusions 
are drawn on the accuracy of survey data. 


Measuring Productivity in Marketing. Theodore N. Beckman, The Ohio State University. 


Explains nature of productivity and marketing segments to which applied conceptually and in measurement 
computations. Measurements developed for wholesalers and manufacturers’ sales branches on wholesale level and 
for total retail trade, as well as food stores and general merchandise stores. Principal productivity ratios for these, 
in terms of constant (1958) sales dollars as output and man-hours of all persons employed as input, presented for 
the Census of Business years since 1935. Problems in determination of both output and input indicated throughout. 

For wholesalers, productivity ratios also developed for capital input factor as well as for labor and capital 
inputs combined. For the latter, a formula is presented to facilitate the combination of productivity ratios for two 
or more inputs into a single ratio. 

While all the measurements developed and presented are on an aggregate national basis, a real breakthrough 
in solving the numerous conceptual and measurement problems can be achieved best and perhaps only by studying 
productivity at the firm level. Experimental work to date proves this feasible, on a pilot study basis, followed by a 
comprehensive research project with a representative sample large enough to make possible building of aggregates 
by industries and trades, etc. Considering vital importance of productivity and its measurement, this would seem 
amply justified. 


A Generalized Guarantee Policy. Lloyd F. Bell, Stanford University. 


Theory is formulated for a policy by which an item, which fails before expiration of its guarantee period, is 
replaced at a cost to the customer equal to an arbitrary fraction, K, of the original cost prorated over the guarantee 
period. Such a policy includes, as special cases, the two commonly used procedures under which the item is replaced 
at no cost to the customer, K =0, and under which the full original cost is prorated over the guarantee period, 
K =1. It is assumed that the number of customers the vendor has is an increasing function of the length of the guaran- 
tee period and a decreasing function of the prorated replacement cost. 

Characterization of values of the length of guarantee and prorated cost function, K, which are optimal from 
the standpoint of the vendor is considered. Cases in which the items fail at times distributed according to the gamma 
and beta families of probability distributions are treated. 
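
The replacement charge described in this abstract can be illustrated with a small sketch. It assumes that "prorated" means proportional to the fraction of the guarantee period elapsed at failure, and that an item failing after the period is replaced at full cost; both are interpretations made here, and the function and parameter names are hypothetical.

```python
def replacement_charge(original_cost, failure_time, guarantee_period, k):
    """Charge to the customer for replacing a failed item.

    Assumed rule: within the guarantee period the customer pays the
    fraction k of the original cost, prorated by failure_time / guarantee_period.
    After the guarantee period the full original cost applies (assumption).
    """
    if failure_time >= guarantee_period:
        return original_cost
    return k * original_cost * failure_time / guarantee_period


# The two special cases named in the abstract, for a 12-month guarantee:
print(replacement_charge(100.0, 6.0, 12.0, k=0))  # free replacement
print(replacement_charge(100.0, 6.0, 12.0, k=1))  # full pro rata charge
```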


Long Range Prospects for the Stock Market. Philip H. Blaisdell, Stein Roe & Farnham. 


Over the next ten years the market will continue its upward trend but the relative rise will not be as large 
as in the last ten years. 

Based on projected earnings, dividends, and book value, the Dow-Jones should reach 1000 in 1970, with 750 
representing a low market, and perhaps 1200 to 1300 representing a high market. Higher levels could be achieved, 
but this would require modification of currently accepted investment fundamentals. 
Changing technology will support continued high price stocks, some of which will be disappointing. Technical 
factors affecting the demand and supply for stocks will support higher levels. The projections assume a favorable 
environment for private capital, including a continuation of a political trend toward the center. It is assumed that 
economic growth will be sufficient to support growth in earnings despite the probability of reduced margins. Inflation 
will continue to be a problem but not to the degree that it has been in the recent post-war period. The projection 
assumes that we will have no deep and extended periods of economic decline. 


On Methods of Constructing Sets of Mutually Orthogonal Latin Squares Using a Computer. R. C. Bose, I. M. 
Chakravarti and D. E. Knuth, University of North Carolina and Case Institute of Technology. 


This is a continuation of the work presented under the same title at the Midwestern Regional meeting of IMS 
this year. The method is to start with module G(2, 2t) whose elements are vectors x = (a, b) where a is a residue class 
(mod 2) and b is a residue class (mod 2t), the addition being defined by (a_1, b_1) + (a_2, b_2) = (c, d) where a_1 + a_2 = c 
(mod 2), b_1 + b_2 = d (mod 2t) and where P_1[x_j] = a_j and P_2[x_j] = b_j and (0, 0), (0, 1), ..., (0, 2t-1), (1, 0), 
..., (1, 2t-1) is the standard order. The existence of a set of m mutually orthogonal Latin squares based on a 
module G is known (Mann 1942) to be equivalent to the existence of a matrix X_{m,4t} = ((x_{ij})) whose rows are elements of 
G and amongst the 4t differences of any two rows every element of G occurs once. The existence of X_{m,4t} implies 
the existence of A_{m,4t} = ((a_{ij})) = ((P_1[x_{ij}])) where a_{ij} = 0 or 1 and in every two-rowed submatrix of A the four pos- 
sible column vectors (0, 0)', (0, 1)', (1, 0)', (1, 1)' 
occur as columns with equal frequency t. Starting with such a matrix A_{m,4t} which exists whenever a Hadamard 
matrix of order 4t exists, a programme was written for adjoining a second coordinate b_{ij} to every a_{ij}, where b_{ij} 
belongs to the ring of residue classes (mod 2t), so that a matrix X_{m,4t} = ((a_{ij}, b_{ij})) could be obtained. For t = 3, this 
method yielded m=5 mutually orthogonal Latin squares of order 12. These results have also been generalized in 
other directions for different orders. 
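
The link between a matrix with the stated difference property and a set of mutually orthogonal Latin squares can be illustrated on a much smaller case. The sketch below uses the cyclic group of residues mod 5 instead of the module G(2, 2t) of the abstract, and assumes each row of the matrix is a permutation of the group; it is not the authors' programme, and all function names are introduced here for illustration.

```python
from itertools import combinations


def has_difference_property(rows, n):
    """True if, for every pair of rows, the n column-wise differences (mod n)
    contain every residue exactly once."""
    return all(
        len({(a - b) % n for a, b in zip(r1, r2)}) == n
        for r1, r2 in combinations(rows, 2)
    )


def latin_square(row, n):
    """Square with entry (r + row[c]) mod n in row r, column c.
    It is Latin when `row` is a permutation of 0..n-1."""
    return [[(r + row[c]) % n for c in range(n)] for r in range(n)]


def are_orthogonal(sq1, sq2):
    """True if superimposing the squares yields every ordered pair once."""
    n = len(sq1)
    pairs = {(sq1[r][c], sq2[r][c]) for r in range(n) for c in range(n)}
    return len(pairs) == n * n


n = 5
# Rows i * j (mod 5) for i = 1..4: a small matrix with the difference property.
rows = [[(i * j) % n for j in range(n)] for i in range(1, n)]
squares = [latin_square(row, n) for row in rows]

print(has_difference_property(rows, n))
print(all(are_orthogonal(a, b) for a, b in combinations(squares, 2)))
```

In this toy case the four rows give four mutually orthogonal Latin squares of order 5, since two superimposed cells can agree only if the corresponding row differences agree.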











Concepts and Computational Problems in Seasonal Analysis. George J. Brabb and Philip J. Bourque, Uni- 
versity of Washington. 


Before the seasonal element in economic time series can be isolated it is essential to achieve a clear conception 
of the component sought. The seasonal component is universally conceived as a periodic fluctuation but there are 
differences of opinion as to the properties which characterize its periodicity. In addition, the interrelationship of 
the seasonal element with other sources of variation in time series (trend, cycle, and irregular variations) necessi- 
tates a holistic approach in conceptualizing components. Conceptual differences cannot be resolved by reference 
to empirical results because each particular method of decomposition reflects a particular conceptual pattern and 
adequate criteria for evaluation of results have not been established. 

The widely used ratio-to-moving-average method for isolating seasonal fluctuations is subject to an important 
bias. The moving average fails to follow the cyclical component through cycle peaks and troughs and a part of the 
cyclical element is left in the seasonal ratios. The discussion of computational techniques directed at this and other 
problems includes a critical analysis of the deseasonalization programs developed at the Bureau of the Census and 
the Bureau of Labor Statistics. 
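
For reference, the textbook form of the ratio-to-moving-average method mentioned above can be sketched as follows for an even period such as quarterly data. This is a simplified illustration, not the Bureau of the Census or Bureau of Labor Statistics programs; the function names and the synthetic series are assumptions made here.

```python
def centered_moving_average(series, period=4):
    """Centered 2 x period moving average (even period); None at the ends."""
    half = period // 2
    out = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        # Half weight on the two end points, full weight on the interior points.
        out[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    return out


def seasonal_factors(series, period=4):
    """Average the ratios series / moving average by season, scaled to mean 1."""
    moving_average = centered_moving_average(series, period)
    ratios = [[] for _ in range(period)]
    for t, (y, m) in enumerate(zip(series, moving_average)):
        if m is not None:
            ratios[t % period].append(y / m)
    means = [sum(r) / len(r) for r in ratios]
    scale = period / sum(means)
    return [f * scale for f in means]


# Synthetic quarterly series: a constant level of 100 times a fixed seasonal pattern.
pattern = [0.9, 1.1, 1.2, 0.8]
series = [100 * pattern[t % 4] for t in range(12)]
factors = seasonal_factors(series)
print([round(f, 3) for f in factors])
```

With a constant underlying level the moving average is exact and the pattern is recovered; around cycle peaks and troughs the moving average cuts across the turn, which is the source of the bias the abstract describes.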


Electronic Computer Programs for Business Cycle Analysis. Gerhard Bry, Rutgers University and NBER, and 
Charlotte Boschan, NBER. 


This paper presents a survey of computer programs specifically designed for the purposes of business cycle 
research, and in operation or in process of preparation at the National Bureau of Economic Research. The paper 
will deal with the following programs, their major applications, and their shortcomings. 

(1) Seasonal Adjustments and Analysis of Time Series Components. 

The discussion will concentrate on the measures of cyclical amplitudes and related ratios, which are pro- 
vided by the Shiskin-Eisenpress program. 

(2) National Bureau “Standard” Business Cycle Analysis of Time Series. 

The discussion will deal with the general usefulness of an analytic tool which was hitherto used primarily 
within the NBER. 

(3) Recession and Recovery Analysis. 

This program provides the computational basis for Geoffrey H. Moore’s recent work concerned with deter- 
mining the current position within the fluctuations of business activity. 

(4) Analysis of Short Cycles. 

A program was devised to test the significance of the degree of synchronization of subcycles among time 
series. 

(5) Diffusion Indexes and Dispersion of Changes. 

The paper will discuss the nature of the analysis, its general usefulness for business cycle analysis, and some 
recent findings derived by its applications. 

(6) Analysis of Distributions and their Cyclical Changes. 

Cyclical changes in the characteristics of distributions could in the past scarcely be analyzed, because of the 
overwhelming amount of computational work involved. A new program provides basic measures of single 
distributions and of frequency distributions, including various measures of inequality. 

(7) Index Number Program. 

This program provides a battery of index numbers, derived from the same basic data. It permits judgment 
on the extent to which the measures of cyclical characteristics of an economic activity are affected by the 
type of index number used. 

(8) Analysis of Interrelated Cyclical Changes. 

This program is still in the planning stage. It aims at a device describing the interrelation of two or more 
time series during various portions of the business cycle. 


Some Solutions to the Problems of Designing Samples of Client Records at Financial Institutions. James C. Byrnes, 
Board of Governors of the Federal Reserve System. 


The problems of designing samples of client records at financial institutions often involve the following re- 
quirements: simple, practical, and uniform instructions for subsampling records filed in a variety of ways; estimates 
of the change over time in population characteristics which involve sampling a high proportion of identical units 
over time; and methods of processing large amounts of low cost-per-unit data to produce a heterogeneous array 
of statistics and their variances. The use of an alphabetical clustering device permits the adaptation of designs 
described by Deming and Hansen, Hurwitz, and Madow for sampling whole clusters from primary units selected 
with probability proportionate to size within an over-all context of single-stage theory. One solution requires whole 
clusters to be selected from proxy populations, such as a large suburban telephone directory, prior to trans- 
mittal of instructions to respondent institutions. The sampling of whole clusters defined in advance meets both the 
requirements for simple subsampling instructions to be applied in a variety of situations, and the requirement for 
sampling a high proportion of identical units over time. The use of single-stage theory meets the requirement for 
simplified estimates of a large number of population characteristics and their variances. 


The Dangers of Abstracted Empiricism. William Capitman, The Center for Research in Marketing, Inc., and 
Communications and Media Research Services, Inc. 


Today’s businessman feels that when a number is assigned to a fact, the fact becomes meaningful and provides 
him with a sense of security. He does not regard a set of statistics as an approximation of reality, but as reality 
itself. Statisticians, too, seldom examine the relationship between the complex mass of raw figures they have trans- 
formed to an organized whole and what is happening in human terms. This is abstracted empiricism, the love of 
the numerical result for its form and beauty. 

Testing advertising effectiveness provides a framework for discussing the dangers inherent in abstracted sta- 
tistics, since a range of statistical measurements are here accepted and employed whose relationship to reality has 
never been explored and are divorced from it and abstract in a meaningless way. We must begin to understand 
more accurately what advertising accomplishes; how people respond to it; how it relates to other communication 
methods; how it affects behavior. Then a more reasonable set of testing procedures must be developed. 

The power and influence of his work requires the statistician to make clear its limitations on the one hand, 
and on the other to be more concerned with its relation to human experience. 


Concepts and Uses of Price Indexes. Arnold E. Chase, U. S. Bureau of Labor Statistics. 


The major functions of price indexes are divided into three principal categories: (1) guide to maintenance of 
economic equities, (2) deflation of value aggregates to estimate physical quantities, and (3) general economic in- 
telligence. These functions are described briefly and the general usefulness of the B.L.S. Consumer Price Index and 
Wholesale Price Index for these purposes is evaluated. Despite certain limitations in scope and concept, the present 
indexes serve the basic purposes for which they were established reasonably well. Some important unmet needs for 
price indexes are indicated, however, as evidenced in part by public misuse of the present indexes. A few suggestions 
are made for additional types of indexes at both consumer and primary market levels. Improvements in both the 
Wholesale Price Index and the Consumer Price Index now being planned and carried out are described briefly. 


Capital Appropriations and Plant and Equipment Expenditure Expectations. Morris Cohen, National Industrial 
Conference Board. 


Quarterly changes in manufacturing appropriations are generally associated with changes in profits and 
capacity utilization, either coincidentally or with a lag of one quarter. The formally recorded capital appropriation 
represents the basic capital spending decision variable. 

The historical record for the past seven years is now clear—the series leads capital expenditures at turning 
points and affords an appreciation of magnitude of change. At the trough, the lead has been two quarters between 
seasonally adjusted appropriations and expenditures. At the peak, the lead has been longer, about four to five 
quarters. 

Supplementing the basic dollar appropriations data for purposes of forecasting are: (1) adjusted changes in 
unspent appropriation backlogs; (2) appropriation commitments; (3) two backlogs to uncommitted appropriations 
and unspent commitments; (4) diffusion of appropriations, backlogs, and spending; (5) cancellations; (6) the break- 
down between appropriations for plant and equipment. 

In addition, the unique quarterly capital appropriations series for thirty-eight separate metalworking industries 
provides the basis for specific industry analysis. 

Finally, the forthcoming capital appropriations series are discussed for electric and gas utilities and for the 
foreign operations of large manufacturing concerns. 


Two Method II Research Topics. Michael J. Conlon and Allan H. Young, U.S. Bureau of the Census.
“Adjustment of Weekly and Ten-Day Series in Economic Forecasting,” by Michael J. Conlon. 


Weekly and ten-day series as compared with monthly series as an aid in forecasting business conditions.
Availability of intra-month series; serious lacunae. Timing and stability of intra-month series. Types of seasonal 
and irregular movement in intra-month series: the seasonal movement proper, regular intra-month movements, 
model-year movements, exaggerated magnitude of distortions due to irregular events and holidays. Special problems 
in seasonally adjusting intra-month series. Use of an electronic computer program primarily designed for monthly
series; evaluation. Other methods. Recommendations: in regard to future collection, in regard to seasonal adjust- 
ment. 





“Seasonal Factor Revisions,” by Allan H. Young. 


An investigation of the amount and causes of revisions in seasonal factors when a series is readjusted with an 
additional year of data using the Univac Method II program. Revisions arising from the extension of factor curves, 
the extension of the trend-cycle curve, the designation of extremes and variation in the preliminary seasonal adjust- 
ment. Suggestion for minimizing revisions. 


Analysis of Failure Data on Components. W. S. Connor and William T. Wells, Research Triangle Institute.


This paper contains a study of failure data from the manufacturing and testing process of a missile. The data 
are for three components of the guidance set. The units of the guidance set are submitted to a battery of mechanical 
and electrical tests as the final stage of the manufacturing process, and are tested from time to time thereafter. 
When a unit fails, it is repaired, so that a single unit may accumulate any number of failures. The tests are not life
tests in the sense that a unit is tested continuously until it fails. Instead, the unit is under test only a small fraction 
of the time. It is found that the number of failures per unit which occur during the manufacturing check-out is 
distributed according to the geometric distribution, and that the number of failures per unit after shipment is dis- 
tributed according to the Poisson distribution. For each component, the parameter in the Poisson distribution was 
estimated using the first two months of data, and predictions were made of the number of failures in subsequent 
months. Very good agreement was found between predicted and observed numbers of failures. 

The production and testing process is described and an explanation is given of how the failure data were
tabulated into cumulative monthly frequency distributions.

Because there appeared to be a smooth change in the failure distributions from month to month, the next 
effort was directed towards bringing time into the mathematical models. To do this successfully required retabulation 
of the data to separate the failures which occurred immediately after production from those which occurred after 
the units had been shipped. 

In order to obtain independent observations, the latter failures were tabulated in yet a different way.
Transition numbers of units were found. These numbers are found in the following way. When $t-1$ months have passed
since manufacture, there are $n_i(t-1)$ units which have failed $i$ times. Of these units, $n_{ik}(t-1, t)$ incur $(k-i)$
additional failures during the $t$th month. This is a transition number.

A description of how the geometric and Poisson distributions were fitted to the data is given and the method 
of making predictions to subsequent periods is illustrated. 
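
The fitting and prediction steps described in this abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' procedure: the failure counts are hypothetical, the function names are invented for the example, and the prediction assumes a constant monthly failure rate so that the Poisson mean scales linearly with the number of months of exposure.

```python
import math


def fit_poisson_mean(failures_per_unit):
    """Maximum-likelihood estimate of the Poisson mean: the sample mean."""
    return sum(failures_per_unit) / len(failures_per_unit)


def poisson_pmf(k, mean):
    """Probability of exactly k failures under a Poisson law with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def expected_unit_counts(n_units, mean, max_failures):
    """Expected number of units showing 0, 1, ..., max_failures failures."""
    return [n_units * poisson_pmf(k, mean) for k in range(max_failures + 1)]


def predict_mean(mean_fitted, months_fitted, months_ahead):
    """Scale the fitted mean to a longer exposure (assumes a constant monthly rate)."""
    return mean_fitted * months_ahead / months_fitted


# Hypothetical failure counts for ten units over their first two months.
observed = [0, 1, 0, 2, 0, 0, 1, 0, 3, 1]
mean_two_months = fit_poisson_mean(observed)
mean_six_months = predict_mean(mean_two_months, 2, 6)

# Predicted number of units with 0..4 failures after six months.
print(expected_unit_counts(len(observed), mean_six_months, 4))
```

The predicted frequencies would then be compared with the observed cumulative frequency distribution for the later month.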


An Analysis of the Accumulated Error in a Hierarchy of Calibrations. Edwin L. Crow, National Bureau of Standards.


Calibrations of many types are performed in a hierarchy of calibration laboratories fanning out from a national 
standard. Often the statement is made that the accuracy of each echelon of the hierarchy should be 10 times the
accuracy of the immediately following echelon. The validity of such statements is examined by stating formulas 
for the total error accumulated over the entire sequence when systematic and random errors may occur in each 
echelon, and by determining how a given total error may be achieved at minimum total cost under a reasonable 
assumption for the form of the cost-error relations. It is assumed that the cost of achieving a given error varies as 
some negative power of the error. Let —a be the exponent of the error in the cost-error relation. As an example of 
the results, if the errors of all echelons are uncorrelated, then the optimum ratio between the standard deviations
in two successive echelons is equal to the (a +2)th root of the number of laboratories in the lower echelon reporting 
to each laboratory in the higher echelon. This optimum ratio frequently turns out to be between 2 and 4. 
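
The rule stated in this abstract for uncorrelated errors can be evaluated directly. The sketch below is a minimal illustration: the function names and the sample values of the cost exponent a and of the number of laboratories are assumptions chosen for the example, not figures from the paper.

```python
def optimum_sd_ratio(labs_per_parent, a):
    """(a + 2)th root of the number of lower-echelon laboratories
    reporting to each laboratory in the higher echelon."""
    if labs_per_parent < 1 or a <= 0:
        raise ValueError("need labs_per_parent >= 1 and a > 0")
    return labs_per_parent ** (1.0 / (a + 2))


def labs_needed_for_ratio(ratio, a):
    """Number of reporting laboratories for which a given ratio would be optimal."""
    return ratio ** (a + 2)


# Hypothetical cases: cost varies as error**(-a).
for labs, a in [(16, 2), (64, 1), (10, 1)]:
    print(labs, a, round(optimum_sd_ratio(labs, a), 3))

# Laboratories per parent needed before a ratio of 10 becomes optimal when a = 1.
print(labs_needed_for_ratio(10, 1))
```

Under this formula a ratio of 10 would be optimal only when the number of reporting laboratories is very large, which fits the remark that the optimum frequently lies between 2 and 4.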





Project TALENT—Progress Report. John T. Dailey, University of Pittsburgh.


Approximately 30 psychological, educational and background measures were administered to a five per cent 
stratified sample of 450,000 high school students in 1357 schools of the United States. The data will be analyzed 
and the students followed up to relate the data to a wide range of later behaviors such as going to college, entering 
the Armed Forces, or becoming scientists or teachers. Follow-up questionnaires will be sent to each subject to 
obtain information regarding educational and vocational choices, degree of success and satisfaction in school courses, 
work experience, description of activities engaged in, and description of the subject’s perception of himself. Follow- 
ups will be one, five, ten, and twenty years after high school graduation. 

The total sample will be sorted, edited, and processed by the University of Pittsburgh’s IBM 7070 Computer. 
The computer will edit and analyze a sample of 45,000 for certain early studies during the fall of 1960. The mas- 
ter tape for the full sample should be ready for analysis early in 1961. 

Some preliminary results from a four per cent sub-sample will be presented. These will include test intercorrela- 
tions and some of the interrelationships of the test scores and several key background factors. 


Details of the Housing Tabulations. Wayne F. Daugherty, Bureau of the Census.


The 1960 Housing Census tract program has been changed and expanded as compared with the 1950 program. First, the
basic unit of enumeration has been shifted from the dwelling unit to the housing unit, permitting recognition of 
kinds of private living accommodations which were not enumerated in the 1950 Census of Housing. These units 
tend to be concentrated in a limited number of tracts in the largest cities, which may result in an increase in the 
number of units reported in these tracts, even though no actual increase in housing may have occurred during 
the decade. 

Second, new subjects will be covered in the tract tables and characteristics previously provided will be
expanded to show more detail.

Third, there has been an increase in the nonwhite tract statistics program. The program will cover a larger
number of tracts than in 1950 and additional information will be shown for tracts having high concentrations of
housing units occupied by nonwhites.

Finally, an innovation for 1960 will provide data in the five southwest states for tracts having 400 or more 
units occupied by white household heads with Spanish surname. 


The 1960 Housing Inventory—Concepts, Coverage, and Trends. Wayne F. Daugherty, Bureau of the Census.


Concepts are viewed in terms of the revolutionary method used for collecting census data. The impact of self- 
enumeration on concepts and definitions is evaluated. Specific attention is given to the change in unit of enumera- 
tion—from dwelling unit to housing unit—and to the concept of housing “condition.” The developments in these 
two concepts are outlined, highlighting the problems and their solutions. 

The extensive application of sampling in the Census has resulted in the utilization of three levels of sampling — 
5, 20, and 25 per cent. The choice of sample size for a given housing item was made on the basis of (1) the desired 
reporting level—whether block, tract, city, county, or larger area; (2) the expected frequency of occurrence; and 
(3) the analytical value of the item—whether necessary or not for cross tabulations. The rationale for the choice 
of sample size for each subject is provided. 

Trends in the housing supply at the county level are presented in detail for selected areas. Local housing 
trends frequently differ considerably from population changes. Therefore, the local trends are described in relation 
to the national housing pattern. 


Court and Probation Statistics. George F. Davis, California Bureau of Criminal Statistics.


This paper contains a discussion of the past history and present status of criminal court and adult probation 
statistics in the United States and in California in particular. The main focus, however, is on the development of 
court and probation statistics within the framework of the California Bureau of Criminal Statistics and the Uniform 
Criminal Statistics Act. Methods of collecting the source material are explained, the statistical document cards
used in gathering these data are displayed, and the basic rules of classification are outlined. Examples are also 
given of the uses of the statistical data that have been gathered in California, with the hope that these data will 
provide the necessary stimulus to other states to develop their own statistical programs. References are made to 
publications in this field that can be obtained through the California Department of Justice, Bureau of Criminal 
Statistics. 


Implications of Prospective United States Population Growth in the 1960’s. Joseph S. Davis, Stanford University.


The United States is entering the third decade of a profoundly significant demographic revolution. 

The 1940's saw an unprecedented rise in the prevalence of the marriage state and two short-lived “baby 
booms.” As births surprisingly flooded to a new high level in 1956-59, our unexpectedly vigorous population in- 
crease was remarkably sustained through the 1950's. Numerical gains in the 1960's will probably exceed those in 
the 1950's. In terms of needs, wants, and productive capacity, the effective population increase will be larger. 

Barring catastrophic destruction, the most important of several demographic developments in the 1960's 
will be the growing older of most of those born since 1940, as the past curve of births is echoed in successive age 
groups. A marked swelling of the teen-age group has begun. In 1964-65 a sharp rise in the number of 18-year-olds 
will inaugurate major increases in highly significant age groups. These will multiply educational requirements, 
necessitate larger public and private investments, and severely tax our ingenuity to meet exceptional challenges. 

On balance, the prospective population developments promise to stimulate consumption and investment, 
promote economic growth and stability, and enlarge our ability to meet the Communist threat, while permitting 
further gains in our levels of living. 


Multiple Comparisons Among Means. Olive Jean Dunn, University of California, Los Angeles.


There has been considerable work done on the problem of finding simultaneous confidence intervals for a 
number of linear contrasts among several means for normally distributed variables. Scheffé and Tukey each give 
a method for constructing simultaneous confidence intervals for all possible linear contrasts among k means using
the F distribution and the distribution of the Studentized range, respectively. Each of these methods may be ex- 
tended to give confidence intervals for all possible linear combinations of the k means, as opposed to linear contrasts 
only. 

In this paper the possibility is considered of picking in advance a number (say m) of linear combinations among 
the k means, and then estimating these m linear combinations by confidence intervals based on a Student t statistic,
so that the overall level for the m intervals is greater than or equal to a preassigned value, 1 - α. For some values of
k, and for m not too large, intervals obtained in this way may be shorter in some sense than those using the F dis- 
tribution or the Studentized range. When this is so, the experimenter may be willing to select the linear combination 
in advance which he wishes to estimate in order to have m shorter intervals instead of an infinite number of longer 
intervals. 
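
One standard way to obtain m Student t intervals with overall level at least 1 - α is to split α equally among them and apply the Bonferroni inequality. The abstract does not state which critical value is used, so the equal split α/m below is an assumption, and the symbols θ_j, s_j and ν (estimate, standard error and error degrees of freedom) are introduced only for this sketch.

```latex
% m preselected linear combinations \theta_1, \dots, \theta_m of the k means,
% with estimates \hat\theta_j and standard errors s_j on \nu degrees of freedom.
\[
  I_j = \Bigl[\, \hat\theta_j - t_{\nu,\,1-\alpha/(2m)}\, s_j ,\;
                 \hat\theta_j + t_{\nu,\,1-\alpha/(2m)}\, s_j \,\Bigr],
  \qquad j = 1, \dots, m .
\]
% Each interval fails to cover with probability \alpha/m, so by the
% Bonferroni inequality
\[
  P\bigl(\theta_j \in I_j \text{ for all } j\bigr)
  \;\ge\; 1 - \sum_{j=1}^{m} P\bigl(\theta_j \notin I_j\bigr)
  \;=\; 1 - m \cdot \frac{\alpha}{m}
  \;=\; 1 - \alpha .
\]
```

The bound holds whatever the dependence among the m estimates, which is why the intervals can be chosen freely in advance; the price is that the guarantee covers only those m combinations.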


Estimating the Kill of Game Animals by Licensed Hunters. Lee Eberhardt, Michigan Department of Conservation.


Michigan game kill estimates are derived from mail surveys of licensed hunters. Systematic samples (with 
random starts) are taken annually from file copies of current hunting licenses. Several different kinds of licenses 
are sold, but the aggregate is well over one million licenses. 

A number of different mail surveys are conducted annually. Double-return post cards are used in all cases, 
and up to four reminder mailings are used to insure a high response rate (letters are used for the last two reminders). 
The largest single sample taken annually includes about 10,000 deer-license buyers. This particular survey has now 
been conducted for eight successive years and has consistently yielded over-all response rates in excess of 90 per cent,
with rates between 94 and 95 per cent in the last five years. In all eight years, the response rate for delivered cards 
has been virtually constant, at 96 to 97 per cent. 

Samples of about 6,000 individuals drawn from a sub-population of deer-licensees have yielded response rates 
as high as 99.4 per cent over all. 

Some studies have been made of the potential effects of non-response, checks of validity of survey results, and 
survey costs. 


Optimal Designs to Estimate the Parameters of Variance Components Models; Two-Way Classification. D. W. 
Gaylor and R. L. Anderson, North Carolina State College.


Methods of sampling are considered in order to obtain good estimates of components of variance in the two-way model

$$y_{ijk} = \mu + r_i + c_j + (rc)_{ij} + e_{ijk},$$

where the effects are normally and independently distributed random variables with zero means and variances
$\sigma_r^2$, $\sigma_c^2$, $\sigma_{rc}^2$ and $\sigma_e^2$.

It was shown that the lower bound for the variance of an unbiased quadratic estimator of a linear function
of components of variance with expected value $\sigma^2$ is $2\sigma^4/(N-1)$, where $N$ is the total number of observations.
Procedures which achieved the lower bound were obtained for estimating

$$\sigma_e^2, \quad (\sigma_e^2 + \sigma_{rc}^2 + \sigma_r^2), \quad (\sigma_e^2 + \sigma_{rc}^2 + \sigma_c^2), \quad \text{and} \quad (\sigma_e^2 + \sigma_{rc}^2 + \sigma_r^2 + \sigma_c^2).$$

A procedure was developed for minimizing the variance of an estimator of $\sigma_r^2$ or $\sigma_c^2$ for a design with $n_{ij} = 0$ or $n$,
where $n_{ij}$ is the number of observations in the $(i, j)$ cell.

A procedure was developed for minimizing the variance of an estimator of $\sigma_r^2/(\sigma_e^2 + \sigma_{rc}^2)$ or $\sigma_c^2/(\sigma_e^2 + \sigma_{rc}^2)$
where $n_{ij} = 0$ or 1.

A few tentative conclusions on the simultaneous estimation of $\sigma_r^2$ and $\sigma_c^2$ were obtained. Two types of designs
were compared.
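
The lower bound quoted in this abstract, twice the fourth power of sigma divided by (N - 1), can be checked numerically in the simplest case, where the quantity estimated is a single variance and the estimator is the ordinary sample variance of N independent normal observations. The sketch below is a Monte Carlo illustration under that assumption only; it does not reproduce the two-way designs of the paper, and the function names and sample values are invented for the example.

```python
import random
import statistics


def variance_lower_bound(sigma2, n_obs):
    """Lower bound 2*sigma^4/(N - 1) for an unbiased quadratic estimator of sigma^2."""
    return 2.0 * sigma2 ** 2 / (n_obs - 1)


def simulated_variance_of_s2(sigma2, n_obs, n_reps=20000, seed=0):
    """Monte Carlo variance of the sample variance s^2 over repeated normal samples."""
    rng = random.Random(seed)
    sd = sigma2 ** 0.5
    estimates = [
        statistics.variance([rng.gauss(0.0, sd) for _ in range(n_obs)])
        for _ in range(n_reps)
    ]
    return statistics.pvariance(estimates)


# Hypothetical case: sigma^2 = 1 and N = 11 observations.
print(variance_lower_bound(1.0, 11))
print(simulated_variance_of_s2(1.0, 11))
```

The two printed values should agree to within simulation error, since the sample variance attains the bound in this one-component case.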


Industrial Production in Current Analysis. Clayton Gehman, Board of Governors of the Federal Reserve System.


Industrial production indexes measure the real performance of the economy and reveal demand and supply 
developments. Business equipment production fluctuates around output of consumer goods influenced in part by
the breadth and rate of changes in consumer goods. The present expansion interval in consumer goods is still under 
two and one-half years in length and has shown more sustained growth than in the earlier postwar recoveries. Industrial
production measures when compared with deflated expenditure series show divergent changes in the economy,
some real and some statistical. The past two years have been an interval of one of the largest differences which have
been difficult to reconcile with existing information. 





Labor Force Projections for California. Maurice I. Gershenson, California Department of Industrial Relations.


In some respects, the California labor force will parallel the national projections for the next decade, but there 
will be many significant differences. 

California's labor force will increase more than twice as fast as the U. S. Labor force participation rates will be 
similar to those for the nation, except for the younger age groups. 

The rate of increase among young persons under 25 years will be greater in California. In the age group 35-44, 
the State will have an increase of nearly 25 per cent as against a decrease for the U. S.

As in the U. S., women will constitute an increasing proportion of the labor force. 

Occupational trends will be similar, except that a relatively greater demand for professional and technical
workers is expected in California. 

A quarter of a million new jobs per year must be provided in California during the decade of the sixties, if 
unemployment is to be kept low. Unlike the projected industry trends for the nation, largest employment increase 
in California is expected in manufacturing. 

The California labor force projections are part of a set of comprehensive projections of various socio-economic 
aspects of the State undertaken by members of the Governor's Interdepartmental Research Coordinating
Committee.


Survey Sampling and Implementation for Development Programs. Roe Goodman, Bureau of the Census.


It is contended that for successful implementation of sample surveys in underdeveloped countries major em- 
phasis must be placed upon the scientific character of surveys. The stressing of the scientific aspects of such under- 
takings is necessary in order to provide a sense of participation in a “prestige” activity and hence an adequate 
motivation for the professional survey staff. By this means the trained survey workers can be encouraged to rise 
above the discouragement which they may otherwise feel due to unsatisfactory administration. As for sampling, 
the emphasis on science requires the use of probability sampling designs which are thoroughly practicable and hence 
no question of sampling efficiency would be permitted to jeopardize the proper implementation of the sample. 
Examples of justifiable inefficiencies are given. 


The Use of the Industrial Production Index in Business Planning. Douglas Greenwald, McGraw-Hill Publishing
Company. 


Business forecasting is only a very early step in business planning. Before the planner can make a decision
he needs the forecast, and before the forecast can be made, the statistician needs the tools to make it. Among his 
kit of statistical tools is the industrial production index.

How business plans are organized and implemented depends upon company size. The availability of funds for
statistical research is the deciding factor on how the forecasts are made. Investigations are carried out by
statisticians and market researchers. This is the stage of business planning where the industrial production index is used.
The goal of good business planning is to achieve maximum profits and minimum risk so only a few investigations 
prove fruitful. 

The channels of communication between the researcher and the planner also depend upon the size of the com- 
pany. This is the area where the technical statistical language needs simplification. 

Seven case studies are used to demonstrate the usefulness of the industrial production index in business fore- 
casting and, in turn, in business planning. The planning problem, the forecasting technique, and the implementation 
of the business plan for each of the seven cases are discussed briefly.





Current Problems in Police Statistics. John I. Griffin, The City College of New York.


The present status of statistical work in major police departments suggests that it is not used as a useful tool 
by police commanders but is regarded as a routine reporting chore. A series of institutes in the larger cities is pro- 
posed as a means to widen the understanding by police of the contributions which statistical methods may make to 
the efficiency of police agencies. The skeptical attitudes toward police statistics on the part of many law enforcement 
officers are due in large measure to the use of these data as a means of judging the officer's own efficiency. It is
proposed that statistical reporting be separated from disciplinary procedures, otherwise the “squeal book” will be
closed. 

Major areas in police statistics which are most promising include: effective presentation of data to the public 
as a means for mobilizing support; development of sample survey procedures in order to provide police commanders 
with necessary data on police hazards; correlation of small area data and police crime complaints in the development 
of defensible schemes for patrol force allocation; and the introduction of electronic data processing, in particular the 
use of file computers with centralized files permitting rapid search. The acceptance of modern statistical methods 
within police agencies is the prerequisite for better reports to state and national bureaus. 


The Statistics Curriculum in the Age of Electronic Computers. Millard Hastay, Washington State University.


Electronic computers, in a single decade of development, have effected a revolution in scientific research. 
This revolution is due to their unprecedented speed, their unparalleled accuracy, the invention of automatic pro- 
gramming, and the development of libraries of general-purpose programs. In statistics their chief contribution 
lies in two areas: (1) where massive amounts of basic data are to be processed; (2) where calculations are too nu- 
merous and complex to be carried out by mechanical methods. The expansion of the universe of possibilities in both 
directions is so spectacular as to confront the statistician with the need for major temperamental adjustments. 
He must learn to “think big” where time and resources have hitherto dictated conservatism; he must reduce his 
emphasis on solutions in “closed form” in the interests of exploiting the machine to find solutions by numerical 
experiment; and he must develop a unified view of statistics adapted to machine potentialities. 

Modern computers are peculiarly suited to statistical processes of two types: general methods of searching for 
optimal procedures in the face of uncertain alternatives, and taxonomic methods of investigating what alternatives 
exist. The decision-theoretic approach to statistics appears to provide the most suitable analytical framework 
around which to organize the required reconstruction of ideas. The outcome should be a revived interest in the 
principles of systematic discovery, of hypothesis-seeking in contrast to hypothesis-testing. 


Improved Factual Tools for Measuring the Styling Trends in the 60’s. Myron J. Helfgott, Lippincott and
Margulies, Inc.


Two things are necessary for a good understanding of design preference, or style trends. They are good
concepts and good measurements.

Since the determinants of preference are neither rational nor conscious, we have been able to learn more about 
design preference from measuring the consumer’s behavior than from talking to him about it.

But even this approach has its problems, since his behavior is subject to many contaminating influences. We 
have to find ways of minimizing these influences, or if we cannot, to measure the nature and extent of the error they 
introduce. 

Perhaps, in addition to refining our measurements, we should be more concerned with the concepts we are 
measuring, in an attempt to develop a theory of design preferences. If we could improve our understanding of the
causes of design choice, our ability to measure it would be much improved. 


Statistical Education of Non-Statisticians in Industry and Medicine. Walter E. Hoadley, Armstrong Cork
Company.


Success in improving the statistical education of non-statisticians in business depends greatly upon (1) the 
degree of recognition of the need for more statistical training, (2) the working relationships between statisticians 
and non-statisticians, (3) the “practical” emphasis of the statistical training program, and (4) the incentives for all 
concerned to further statistical education. 

The key objectives of any business statistical education program must be to: (1) broaden the statistical under- 
standing of non-statisticians so they can make a greater contribution directly to the profitable growth of the com- 
pany; and (2) improve the general acceptance of statistical methodology across the company so professional statis- 
ticians can enlarge their own contributions and hence their own status. 

Most business organizations, and especially the largest, favor continuing educational programs for employees. 
Hence, statistics must be included officially in the general curriculum, whether the study is to be undertaken within 
or outside the company. Obviously, this means management must be “sold” on the value of statistical training, 
and professional statisticians must accept the principal responsibility for this task.

The actual course of instruction must not only be sufficiently comprehensive to give non-statisticians a general
knowledge of the field but also to permit them to make applications to problems at hand. 


Measuring Effect of Lettuce Advertising by Multiple Regression. H. S. Houthakker, Stanford University, and
Frank Meissner, Stanford Research Institute. 


In the period 1952-57 Lettuce, Inc.—a group of head lettuce growers and shippers from Salinas, California— 
carried on a sustained, national 8-media campaign promoting the C-7 brand of lettuce (C for California, 7 for seven 
days a week). In 1956, as part of a broad “impact survey” an attempt was made to quantitatively measure the 
association between the promotion “intensity” and per capita consumption of lettuce in 22 U. S. cities. A multiple
regression analysis indicated that the weighted over-all net effect of the C-7 program was equivalent to an annual
increase in demand for head lettuce of 8.8 carlots per 100,000 population. Since Salinas accounted for about one-half
of total U. S. head lettuce shipments, the area would presumably be able to supply at least half of this increase.

These modest results indicate that, under certain circumstances, the multiple regression can: (1) be a tool for 
quantitative evaluation of the general demand shifting attributes of sales promotion and advertising, and (2) help 
to estimate the sales producing outputs of advertising inputs of an individual firm. 

In order to express the input-output relationships in dollar terms, the financial data on promotion and advertis- 
ing would have to be made available by individual marketing areas. This would be no extra burden on accountants 
of advertising agencies, provided that the statistical evaluation, and thus the data collection, is made an integral 
part of the over-all plan of action for the campaign. 


The Problem of Equivalence of Meaning in Cross Cultural Research. Bradford B. Hudson, Rice University.


The observations described in the following are based upon a program of research in five Arab countries in the 
Middle East and the United States, a comparative social psychological study of young adults in secondary schools 
and colleges together with small samples in the Middle East of illiterates. The combined sample is approximately 
4,500. 

One of the central problems in cross-cultural research is establishing the equivalence of meaning of questions 
and responses in questionnaires and interviews. The methods at this stage of development provide only estimates 
of equivalents. Two approaches to the problem are discussed: pre-administration checks on translations and post- 
test estimates of equivalents. In the area of personality and related variables, the latter are provided by the cross- 
cultural correspondence (1) of interscale relationships or correlation patterns, and (2) of response patterns to indi- 
vidual questions by groups with given trait characteristics. 

Systematic cross-cultural deviations in patterns of correlations may reflect either differences in meaning or 
indicate cultural differences. The authoritarian in the Middle East, for example, though similar in many respects to 
his U. S. counterpart, tends to be somewhat better adjusted in terms of the measures employed, reflecting, it is
inferred, the rewards associated with cultural conformity. 

The data in these studies imply that although average scores may differ substantially cross-culturally, inter- 
trait relationships are similar, the self-accepting or the anxious individual, for example, being adjusted or anxious 
for the same classes of reasons regardless of culture. These uniformities in relationships may ultimately provide a
common baseline for reliable cross-cultural comparisons, reflecting a conclusion enunciated by Whiting and Child 
that “there are some principles of personality development which hold true for mankind in general and not just for 
Western cultures.” 


Training the Undergraduate in the Use of the Computer with Special Reference to Students of Statistics. W. C. 
Jacob and A. H. Taub, University of Illinois.


It is our belief that the training of all undergraduate students in the use of computers should begin in the 
same way and that this initial training should familiarize the student with general properties of computers, how they 
may be used effectively on a class of problems, and the detailed properties of one or more specific computers. We feel 
that “short courses” of two to six weeks’ duration which teach one how to use a “programming system” for a
particular computer are completely inadequate. We shall first discuss the contents of what we believe to be an adequate
first course in computers and then discuss the contents of additional courses designed especially for advanced under- 
graduate or beginning graduate students whose principal interest is in statistics. 

The first, introductory, course in computers should include a discussion of the following topics: (a) Arithmetic 
Systems, (b) Organization of Computers, (c) Machine Codes, (d) Machine Arithmetic, (e) Problem Formulation 
and Organization, (f) Use of Subroutines, (g) Diagnostic Routines, (h) Description of Various Input and
Interpretive Routines.

The proposed second course for students of statistics is meant for advanced undergraduate and beginning 
graduate students. This course should include the following topics: (a) Introduction and Review, (b) Preliminary 
Data Handling, (c) Organization of Statistical Problems for Computers, (d) Solution of Problems associated with 
Specific Applications. 


Needed Basic Steps in More Effective Use of Statistical Methods in Audit Sampling. Robert W. Johnson, Touche, 
Ross, Bailey & Smart. 


The hopes that were held out ten years ago for applying statistical sampling techniques in the independent 
audit of financial statements have not been fulfilled. The lack of progress appears to derive principally from the 
difficulty auditors have in stating the purposes of any test specifically and unequivocally, and from the heretofore 
limited extent to which statisticians and auditors have jointly attacked audit questions. 

This lack of definitiveness arises in part out of the auditor’s knowledge that information gleaned in one area 
of an audit may quite unexpectedly lead to important findings in other areas. In part, however, such imprecision 
stems from the absence of satisfactory models of business information systems. It is suggested that more extensive 
use of statistical sampling by auditors may have to await research which will produce better understanding of business 
systems and of the purposes of independent audit. Suggestions are proffered for research in this area. 


Countercurrent Dialysis—A Stochastic Process. Marvin A. Kastenbaum, Oak Ridge National Laboratories. 


Peptides isolated from rat liver microsomes may be separated from contaminating amino acids, sugars and 
salts by countercurrent dialysis. A single such dialysis can achieve only fractional purification. Therefore, more 
complex systems of multiple dialysis, in which a linear series of cells are used, have been devised. The system of 
immediate interest may be described as follows: Let each dialysis cell be a stage, and each dialysis period be a 
cycle. Then after each cycle, the dialout (the solution outside the dialysis sac) at each stage is concentrated and 
used as the dialin (the solution inside the dialysis sac) in the succeeding stage, whereas the dialin at each stage 
is concentrated and returned to the dialysis sac of the previous stage. The dialin at the first stage is retained at 
the first stage, and the dialout at the final stage is taken out of the system. In the nomenclature of stochastic proc- 
esses, the first stage represents a reflecting barrier and the last stage an absorbing barrier. 

To determine the probability with which a particle of the isolate will be at a specified stage in the system after 
a given number of cycles have been carried out, the system is described in terms of a Markov process with a stochastic 
matrix resulting from the product of two matrices, namely, a diffusion matrix and a transfer matrix, and alge- 
braically explicit solutions are derived for any number of stages and cycles. 
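
The stage-and-cycle scheme described above can be sketched numerically. The abstract does not give the diffusion probabilities, so the sketch below assumes a single hypothetical probability `p` that a particle passes into the dialout during one cycle, the same at every stage; the function names and the extra "removed" state are likewise illustrative. The one-cycle matrix is formed, as in the abstract, as the product of a diffusion matrix and a transfer matrix, with the first stage acting as a reflecting barrier and removal after the last stage as an absorbing barrier.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def cycle_matrix(n_stages, p):
    """One-cycle stochastic matrix = diffusion matrix x transfer matrix.

    States 0..n_stages-1 are the stages; state n_stages means "removed".
    p is an ASSUMED per-cycle probability of passing into the dialout.
    """
    n = n_stages
    # Diffusion: stage i -> dialin_i (column i) or dialout_i (column n + i).
    diffusion = [[0.0] * (2 * n + 1) for _ in range(n + 1)]
    for i in range(n):
        diffusion[i][i] = 1.0 - p
        diffusion[i][n + i] = p
    diffusion[n][2 * n] = 1.0  # removed material stays removed
    # Transfer: dialin goes back one stage (first stage retains it),
    # dialout goes forward one stage (last stage's dialout leaves the system).
    transfer = [[0.0] * (n + 1) for _ in range(2 * n + 1)]
    for i in range(n):
        transfer[i][max(i - 1, 0)] = 1.0
        transfer[n + i][i + 1] = 1.0
    transfer[2 * n][n] = 1.0
    return matmul(diffusion, transfer)


def occupancy(n_stages, p, cycles):
    """Distribution of a particle that starts in the first stage."""
    step = cycle_matrix(n_stages, p)
    dist = [[1.0] + [0.0] * n_stages]
    for _ in range(cycles):
        dist = matmul(dist, step)
    return dist[0]
```

For example, `occupancy(3, 0.5, 3)` gives the probabilities of finding the particle in each of three stages, or removed, after three cycles. This is repeated matrix multiplication only; the algebraically explicit solutions mentioned in the abstract are not reproduced here.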


The Function of Expectational and Motivational Data. George Katona and Eva Mueller, University of Michigan. 


Based on the hypotheses that certain forms of demand are a function of ability to buy and of willingness to 
buy, and that willingness to buy may change independently from ability to buy, the Survey Research Center 
has collected measures of willingness to buy over the past ten years. Intentions to buy data, collected for fifteen 
years, are viewed as a useful supplementation of data reflecting shifts in consumer sentiment. Time series correlation 
calculated over seventeen periods between the Center's Index of Consumer Attitudes and Inclinations to Buy and 
Commerce Department data on durable goods sales during the subsequent two quarters indicates that the ex- 
pectational data have substantial predictive value. Time series correlation should be supplemented by a special 
study of turning points. The Survey Research Center data on consumer expectations made their most important 
contribution to forecasting in the summer of 1954 and again in the summer of 1957. Expectational and motiva- 
tional data also shed light on the factors responsible for changes in demand and on longer range trends in consumer 
behavior. 


Capital Investment: Plans and Performance. Dexter M. Keezer and Margaret K. Matulis, McGraw-Hill 
Publishing Company. 


The annual surveys of capital investment plans, conducted by the government and by McGraw-Hill, have 
proved to be a reliable guide to both the direction and volume of investment in new plants and equipment in the 
year immediately ahead. In the last eight years the degree of change indicated by the McGraw-Hill surveys has 
averaged about 4 per cent from the actual change that occurred. 

The McGraw-Hill Fall survey, a check-up on business’ plans for capital investment one year ahead, although 
not as precise a gauge of the volume of investment because companies’ plans are necessarily preliminary at the time 
the survey is taken, has been a good indication of the direction capital expenditures would take. In November 
1954 it provided the first broad indication that capital investment would turn up the following year. 

Combined with the purposes of investment as reported in the McGraw-Hill surveys, the quantitative figures 
on capital investment provide important tools to business forecasters and analysts. Key forces underlying invest- 
ment plans are illuminated by the surveys. The McGraw-Hill index of manufacturing capacity is the only available 
direct measure of actual and prospective increases in capacity. This index combined with the data on manufacturers’ 
operating rates and preferred levels of operation has provided reliable indications of prospective shifts either to or 
away from expenditures for expansion. 

Figures on research and development expenditures, plans and performance, provide another key force under- 
lying the character and magnitude of investment in the coming years. 

Depreciation allowances and companies’ policies for using them, as reported in these surveys, are another 
important force shaping the volume of capital investment in the years ahead. 


Ford Motor Company Training Program in Engineering Uses of Statistics. Calvin J. Kirchen, Ford Motor 
Company. 


“Challenge and stimulate” is the criterion for inclusion of material in Ford Motor Co.'s training program in 
engineering uses of statistics. The basic ideas of hypothesis testing provide the first goal to be presented, by way 
of the binomial distribution, operating characteristic of an inspection plan, Type I and Type II errors and their 
risks. Thus, the statistical contribution to experimental work is indicated as quickly as possible. In contrast, the 
traditional approach through teaching calculation of the mean and the standard deviation emphasizes dull, tedious 
material and dampens interest rapidly. But given a taste of hypothesis testing, the student sees calculation of mean 
and standard deviation as tools in their proper perspective, and not as mountain peaks achieved at the expense of 
“blood, sweat and tears”. Association of variables involved in various studies leads easily to discussion of straight- 
line regression, normal equations for algebraic polynomials; the spirit of Dr. Snedecor's treatment of linear regression 
is very important to follow here. Analysis of variance in the simplest designs but developed to display progressive 
reduction in error sum of squares is now presented; a quick review of linear regression provides a means of intro- 
ducing covariance to reduce error sum of squares further. Basic ideas of two-level factorial experiments and of 
fractional factorials as expressed by Davies and Hay in their 1950 paper in Biometrics, and the use of the NBS 
catalog of fractional factorials complete the course as presented so far. 


A Multistage Probability Sample for Traffic Surveys. Leslie Kish, Survey Research Center, Warren Lovejoy 
and Paul Rackow, Port of New York Authority. 


The chief purpose of the survey is to make origin and destination estimates of the ninety million automobile 
and truck crossings per year between New Jersey and New York across the six vehicular facilities of The Port of 
New York Authority. A sample survey has been in successful operation now for over two years, taking about 90,000 
brief interviews with sample drivers in each year. About 40 interviews are taken per hour. An interesting feature 
is the design which samples all time segments throughout the year and all traffic lanes. The selection of time periods 
is balanced for all traffic directions so that an estimate for either a specific time period or for a specific direction is 
balanced out against the other. Yearly estimates are computed together with their sampling errors. For important 
segments seasonal estimates are also available. This continuous traffic survey sample replaces the previous practice 
of taking some 200,000 interviews on a “typical” weekday, Saturday and Sunday with a suddenly hired labor force 
of about 300 interviewers. Another interesting feature of the design is the way in which the sample design balances 
the resources against the strict requirements of the circumstance. These involve reasonable working loads and work 
periods while maintaining a model of probability selection. 


Two Studies of Interviewer Variance of Socio-Psychological Variables. Leslie Kish and Carol W. Slater, 
University of Michigan. 


In the First Study a sample of 462 workers of a factory was randomized among 20 interviewers and 46 items 
investigated, mostly open-ended questions on attitudes toward work, plant and union. In the Second Study 489 
workers of another factory were randomized among nine experienced interviewers and we investigated 25 
items from an oral interview and 23 from a written questionnaire. In each study we measured the effects of the 
variable bias of the interviewers as a component of the total variance per respondent; thus $s^2 = s_a^2 + s_b^2$, where $s_a^2$ 
is the “among interviewer” and $s_b^2$ the “within interviewer” component and roh $= s_a^2/s^2$ is the ratio of the interviewer 
effect. Our conclusions: (1) We can obtain responses with low or moderate interviewer variance on highly 
“ambiguous” and “critical” attitudinal questions. (2) The range of roh’s was mostly 0 to .07 in the First Study and 
0 to .04 in the Second and practically zero in the questionnaires. These effects are not generally higher than for 
most factual items in a good Census! (3) These seemingly small roh’s, combined with moderate workloads, still 
increase the variance by factors as high as two or three, because the ratio of increase is [1+roh(n-1)] where n is 
the average workload per interviewer. (4) The analysis can point to items with unduly large variances and call for 
corrective measures. (5) Our attempts to distinguish different average interviewer effects for several classes of 
variables—ambiguous, critical and factual items—failed. (6) The interviewer effects on subclass means were shown 
to be smaller in accord with [1+roh(n*-1)] where n* is average number of subclass members per interviewer. 
(7) In the comparison of subclass means, the interviewer effects tend to zero. 
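
The ratio of increase quoted in conclusion (3) is easy to check by hand or with a short calculation. The sketch below is illustrative only: the function name is ours, and the workloads are simply the sample sizes divided by the number of interviewers reported above (462/20 and 489/9), paired with the upper ends of the reported roh ranges.

```python
def variance_inflation(roh, workload):
    """Ratio of increase in variance, 1 + roh * (n - 1), for average workload n."""
    return 1.0 + roh * (workload - 1.0)


# Average workloads implied by the two studies described in the abstract.
first_study = variance_inflation(0.07, 462 / 20)   # about 2.5
second_study = variance_inflation(0.04, 489 / 9)   # about 3.1
```

Both values fall in the "two or three" range mentioned in the abstract, which shows how a roh of only a few hundredths becomes important once each interviewer handles a few dozen respondents.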


On Theoretical Questions Underlying the Technique of Replicated or of Interpenetrating Samples. J. C. Koop, 
North Carolina State College. 


An important development in statistical methodology is the technique of interpenetrating samples, introduced 
by Mahalanobis in 1939. It is one of the techniques recommended by the United Nations Subcommission on Statistical 
Sampling, and there has been much discussion on its use in the study of response (or ascertainment) errors 
and in the comparison of the results by different investigators or teams of investigators. However, the theoretical 
basis of the whole subject, so far, has not been fully explored and it appears that its use has not been extended much 
beyond India partly for this reason. 

In this paper it is shown that when independent replicated (or interpenetrating) samples are used the formula 
for the calculation of estimates of variances of the estimates of population values (however complex the sample 
design and whatever the functional form of the estimating formula, e.g., ratios) reduces to the well-known classical 
formula for the variance of the estimated mean of an infinite population. 
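
As a small illustration of the "well-known classical formula", the sketch below treats the k replicate estimates as independent observations and estimates the variance of their mean as the sum of squared deviations divided by k(k-1). The function name and the sample figures are hypothetical and are not taken from the paper.

```python
def replicated_variance(estimates):
    """Mean of k independent replicate estimates and the estimated variance of that mean.

    Uses sum((t_i - t_bar)^2) / (k * (k - 1)), which needs k >= 2.
    """
    k = len(estimates)
    if k < 2:
        raise ValueError("at least two replicated samples are needed")
    mean = sum(estimates) / k
    ss = sum((t - mean) ** 2 for t in estimates)
    return mean, ss / (k * (k - 1))


# Hypothetical replicate estimates of some population value.
mean, var_of_mean = replicated_variance([10.0, 12.0, 14.0])
```

The same calculation applies whatever the form of the estimate within each replicate, which is the practical appeal of the technique; with two replicates it reduces to the squared difference divided by four.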

In the case of linear estimates the efficiency of the combined set of replicated samples is shown to be equal 
to that of a single equivalent sample when: 

(i) the samples are drawn from an infinite population, 

(ii) the first stage units are drawn with equal or unequal probabilities, but with replacement in the first stage, 
in the case of multi-stage sampling of a finite population. 

Further for single or multi-stage samples from a finite population where each unit is drawn with equal prob- 
abilities and without replacement at each stage, the efficiency of the combined set of interpenetrating samples is 
less and depends entirely on the first stage sampling fraction. It is not very much less efficient if the first stage 
sampling fraction is small. Further for the same equivalent sample size efficiency decreases as the number of replicated 
samples increases. This demonstration throws into relief the fact that it is best to take only two replicated samples 
despite the fact that there is only one so-called “degree of freedom” to estimate the sampling variance of the mean 
of the two estimates derived from each of the replicates. 

Finally, applications in which there is much theoretical uncertainty about the use of the method are discussed. 
The appropriateness of the theory for use in the estimation of sampling errors of (i) price index numbers, (ii) estimates 
whose underlying functions have variances which are intractable or unknown, is demonstrated. 








The Future Outlook for the Recruiting of Statisticians. Carl F. Kossack, IBM Research. 


The problem is discussed from the following major problem areas: (1) the lack of status of statistics, (2) the lack 
of preparation of students, (3) curriculum problems, and (4) professional organization needs. In the status area the 
“tabulation” type of activity is considered as a source of the difficulty. The preparation of student difficulty is 
associated with the mathematical background deficiency of students and their need for an “applied” field. Curriculum 
problems are associated with the current conflict between the applied and theoretical schools in statistics. 
The need for a professional society is suggested. Several courses for immediate action are generally given by the ASA. 


Scope and Impact of the 1960 World Population Census Programs. Max Lacroix, Statistical Office of the United 
Nations. 


The 1960 World Population Census Programme (given its initial impetus by the United Nations Statistical 
Commission and Statistical Office, with important contributions from the conferences of statisticians in Africa, 
in Asia and the Far East and in Europe and from the Inter American Statistical Institute) was aimed primarily at 
bringing about more and better national population censuses and secondarily at obtaining more internationally 
comparable results. 

The United Nations Statistical Office prepared a set of Principles and Recommendations for National Population 
Censuses (to guide the countries) and a three-volume Handbook of Population Census Methods (to facilitate the implementation 
of the Principles). In addition to these basic Principles, several adaptations have evolved, geared to 
the special problems of census-taking in particular regions, taking account of the degree of development of the area, 
the resources in staff and equipment, and other factors. These adaptations were prepared by Working Groups of 
the UN Regional Conferences of Statisticians, and the Committee on the Improvement of National Statistics (IASI). 

An important aspect of the Programme was the training of key census personnel in various regions of the 
world—especially in Latin America, in Asia and the Far East and in the Middle East. In 1955, censuses were dis- 
cussed in detail at a statistical seminar for Arab States held in Cairo; in 1958, training centres were held in Santiago 
and Tokyo. 

Population censuses provide but a segment of the data on the economies of highly industrialized countries 
(the latter having other sources of information, such as censuses of industry and distribution, special surveys, non- 
governmental data). On the other hand, in the less-developed countries, population census data are vitally im- 
portant for mirroring the existing economic conditions. 


Studies of Validity in Reporting Financial Data. John B. Lansing, University of Michigan. 


This paper represents a summary of the findings of an investigation of response error in reports of economic 
information. This investigation was undertaken at the Survey Research Center of the University of Michigan as 
part of a larger project carried out by the Inter-University Committee for Research on Consumer Behavior. 

Before this investigation began there was evidence that the response error in reports of some types of financial 
information is greater than for other items. Reports of income, for example, are relatively accurate, while reports 
of cash loans and savings accounts seem to be relatively inaccurate. This investigation focussed on the inaccurate 
items. 

The procedure used was to obtain a list of individuals about whom some item of financial information was 
known or could be ascertained. The individuals were then interviewed as if the information were unknown. Several 
field studies of this type were completed involving about 100 interviews apiece. Comparisons of the responses with 
the information available about the individuals did tend to confirm the existence of large response errors. 

Investigation of the correlations between the size of the errors and various socioeconomic characteristics of the 
individuals indicated that the errors tend to be larger for some segments of the population than for others. 

In an attempt to reduce the observed errors experimental manipulations of data collecting procedures were 
introduced. Moderate improvements were obtained. The investigators, however, became convinced of the im- 
portance of developing a theory of response error. In the absence of such a theory the number of possible techniques 
which can be suggested for experimentation is very large and only hunches and common sense can guide the selection 
among the possibilities. As the work progressed, therefore, the investigators devoted an increasing share of their 
efforts to an attempt to develop a theory of response error. 


Stochastic Analysis as Applied to the Multiple Linear Regression Equation. Dick A. Leabo, The University of 
Michigan. 


Objectives 

This paper has a dual purpose: (1) to measure which source of income has the greatest effect in the determination 
of the final level of personal income in a state (Michigan data for the years 1931-41 and 1946-58 are used as an 
example), and (2) to demonstrate the use of the technique of stochastic analysis to modify the multiple linear 
regression equation. As the title indicates, the second objective is the real basis for the paper with the first purpose 
being an important by-product. 


Methodology 


In order to accomplish the first objective a multiple linear correlation was run to determine the stability of 
the eleven major sources of Michigan's personal income. The sensitivity of the State’s income is measured by the 
relationship of each of these major sources to Gross National Product utilizing simple linear correlation in each case. 

The second objective, demonstrating the use of stochastic analysis to modify the multiple regression equation, 
is achieved by following several brief steps: 

(1) Estimate random disturbance ($W_{e_i}$) for each of eleven major sources of Michigan personal income. 

(a) Determine range of deviations of predicted values (Xo from actual Xo) for each source of income, 
utilizing the simple linear correlations. 
(b) Set up a frequency distribution of these deviations: 
(1) use $i = \text{range}/(1 + 3.322 \log N)$ 
to determine class interval of deviations. 
(2) assign random numbers to each class. 
(3) using a random numbers table, select one class and use midpoint to represent the random disturbance 
($W_{e_i}$) for each source of income. 
(2) Estimate Michigan's personal income by substituting in the modified multiple linear regression equation. 


$X_0 = a + (b_1 X_1 + W_{e_1}) + (b_2 X_2 + W_{e_2}) + \cdots + (b_{11} X_{11} + W_{e_{11}})$. 
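
The steps above can be sketched in a few lines. This is an illustration under stated assumptions rather than the author's computation: the function names are ours, the logarithm is taken to base 10 (the usual reading of the 3.322 constant), a pseudo-random generator stands in for the random numbers table, and each class is assumed to be drawn with probability proportional to its observed frequency, which the abstract does not state.

```python
import math
import random


def class_interval(deviations):
    """Class width i = range / (1 + 3.322 log10 N); assumes N >= 2 and a nonzero range."""
    spread = max(deviations) - min(deviations)
    return spread / (1 + 3.322 * math.log10(len(deviations)))


def frequency_classes(deviations):
    """Class midpoints and observed counts for the frequency distribution."""
    width = class_interval(deviations)
    low = min(deviations)
    n_classes = math.ceil((max(deviations) - low) / width)
    counts = [0] * n_classes
    for d in deviations:
        counts[min(int((d - low) / width), n_classes - 1)] += 1
    midpoints = [low + (k + 0.5) * width for k in range(n_classes)]
    return midpoints, counts


def draw_disturbance(deviations, rng):
    """Pick one class at random (ASSUMED: weighted by frequency) and return its midpoint."""
    midpoints, counts = frequency_classes(deviations)
    return rng.choices(midpoints, weights=counts, k=1)[0]


def modified_regression(a, b, x, w):
    """X_0 = a + sum over sources of (b_i * X_i + W_i)."""
    return a + sum(bi * xi + wi for bi, xi, wi in zip(b, x, w))
```

A call such as `draw_disturbance(devs, random.Random(1))` yields one disturbance for one source of income; repeating it for each of the eleven sources supplies the `w` argument of `modified_regression`.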


How to Measure Juvenile Delinquency. Peter P. Lejins, University of Maryland. 


The subject of this paper is interpreted as issues involved in the development of statistics of juvenile 
delinquency for the purposes of measuring the over-all amount and the specific kinds of juvenile delinquency in 
the various communities in order to compare these with one another and one and the same community with itself 
over a period of time. Essential to the development of such statistics is the uniformity of the definitions of the 
categories of facts to be recorded and of the procedures involved. The extreme weakness of the field of juvenile 
delinquency in this respect is obvious. The basic concept, viz., that of juvenile delinquency itself, is discussed, and 
the “formal” as against the “descriptive” or “content” definition given preference. The fact that the statistical data 
on juvenile delinquency are a function of at least two factors: 1. the behavior of the juvenile within the setting of 
his community and 2. the policy of the law-enforcement agencies (special agencies), is brought out. This means 
that contrary to the usual practice the changes and differences in the juvenile delinquency statistics should not be 
interpreted as changes and differences in the behavior of the juveniles before the potential role of the other factor, 
viz., that of the changes in the policies of the special agencies, is examined. The policies of the law-enforcement 
agencies are further analyzed as consisting of policies in the true sense of the word and of policies with regard to 
recording procedures. The clear spelling out and the uniformity of both is, of course, a prerequisite of any meaningful 
system of delinquency statistics. The question of the kind of statistics serving best as an index of delinquent behavior 
is discussed next, and the juvenile court statistics are given over-all preference. The difference in that sense 
from adult criminal statistics is related to the different role of the police in delinquency control. Finally, the ways 
toward adequate local, state and national delinquency statistics are analyzed and the issue of sampling versus the 
universe as the ultimate goal is discussed. 





Characteristics of the Insured Unemployed. Louis Levine, Bureau of Employment Security. 


Although we have developed many important economic indicators in recent years, unemployment information 
continues to be the most widely recognized and understood index of the health of the economy. During the last 
three decades much progress has been made in the collection and analysis of unemployment data. 

The mass unemployment of the thirties focussed attention on the number of unemployed. As we come to 
know more about unemployment, however, it has become apparent that there are many kinds of unemployment 
and that they have a differential impact on the various groups in the labor force. This realization has brought a 
more sophisticated approach to the measurement of unemployment. For policy and program purposes, it has become 
essential to analyze the structure of unemployment and to measure the impact, in terms of the personal, economic 
and geographic characteristics of the unemployed. 

It is important to distinguish, for example, between the economic impact of the unemployment often en- 
countered by the new entrant or reentrant into the labor force, and that of experienced workers separated from a 
job—the “disemployed unemployed.” It is this latter group with which unemployment insurance is concerned. 
Over four-fifths of all wage and salary workers are now covered by unemployment insurance. 

As a basis for understanding the unemployment problems of this major sector of the work force, information 
on their characteristics: economic, personal and geographic is becoming available monthly. These data will provide 
the backdrop for public policy and program planning. 


Training for Survey Research in Underdeveloped Countries. Rensis Likert, Institute for Social Research. 


Underdeveloped countries usually lack accurate aggregate statistics. Consequently, they have to rely, even more 
than developed countries, on obtaining such information from sample surveys. The objectives of underdeveloped 
countries in their planning and improvement efforts will be more efficiently and effectively achieved if substantial 
use is made of sample surveys. Unfortunately, there is a serious, and in most cases a complete, lack of indigenous 
personnel trained in survey methodology. 

Technical assistance has been given some countries in sample design, but this is only part of survey methodology. 
Well-designed samples are being used with poor questionnaires and poorly trained interviewers, leading both 
to unnecessarily large random errors and to biases of unknown magnitudes. The value of the data would be sig- 
nificantly increased by the use of better methodology in the design of the questionnaires and in the training and 
supervision of interviewers. 

High priority should be given to the development of survey competence and institutions in these countries. A 
technical assistance program to foster the use of sample surveys should involve both (1) training in the United 
States of personnel from the underdeveloped countries and (2) assistance in establishing and operating sample 
survey organizations in these countries. 


Statistics of the Scoring in a Baseball Game. G. R. Lindsey, Defence Research Board of Canada. 


The best known application of statistics to baseball concerns day-by-day reports of the batting averages. But 
this is done with no regard to the very considerable sampling error. Too many significant figures are retained, and 
unjustified conclusions regarding the promotion or demotion of players are drawn. 

A very different aspect of baseball which can be examined by statistical methods is the progress of the score 
throughout the course of the game. One tends to remember close games and dramatic finishes. The object of this 
study is to see whether some innings are actually more productive than others, and whether there is in fact a 
tendency for the lead to change hands, or, conversely, for the leading team to stay ahead. 

The conclusions are that all innings are not the same. The first and third are the most productive, the second 
the least. There is a tendency towards very low and very high scores and very low and very high final winning 
margin. The observed frequency of overcoming a lead agrees closely with the prediction based on probability and 
assuming complete independence between half-innings. Thus there is no measurable bias to maintain or to overcome 
a lead. 


Short Range Prospects for Stock Prices. J. A. Livingston, Philadelphia Bulletin and Syndicated Columnist. 


This is the 10th bear market in this century. Nine of these, including this one, have developed out of unfavor- 
able money market conditions. An economic and stock market boom perpetuates itself until money for further 
economic expansion becomes short and the charge for money becomes high. Thus, the present stock market decline 
is a correction of the disparity between yields on stocks and yields on bonds. 

The decline also corrects the unusually high price-earnings ratio prevailing at the end of last year and the 
beginning of this year. This is the only way these two imbalances could have been righted since corporate per share 
earnings have remained relatively static over the past five years despite a rising volume of business. 

An upturn in stock prices won't take place until (a) a sharp recovery in corporate profits seems probable, or 
(b) until the decline carries to a level at which earnings and dividends again are conducive to common stock in- 
vestment for income. 


Bayesian Tabulations and Confidence Belts. Mitchell O. Locks, University of California, Los Angeles. 





The statistic c representing the number of successes out of c+d observations has useful properties in the 
analysis of Bernoulli processes. Tabulations have been made of the distribution of c/(c+d) by digital computers. 
These tabulations are used to construct confidence belts for the interval estimates of the unknown parameter p 
which are compared to those obtained by the original Pearson and Clopper belts. 
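
For readers who want a point of comparison, the classical Clopper and Pearson limits for p can be computed directly from binomial tail probabilities. The sketch below is not the Bayesian tabulation described in the abstract, whose details are not given here; it only shows the reference belts, found by bisection. The function names and the default level are our own choices.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))


def _root_of_increasing(g, iters=200):
    """Bisection on [0, 1] for a function g that increases from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(c, d, alpha=0.05):
    """Two-sided limits for p given c successes and d failures (n = c + d)."""
    n = c + d
    # Lower limit solves P(X >= c | p) = alpha / 2; it is 0 when c = 0.
    lower = 0.0 if c == 0 else _root_of_increasing(
        lambda p: (1 - binom_cdf(c - 1, n, p)) - alpha / 2)
    # Upper limit solves P(X <= c | p) = alpha / 2; it is 1 when c = n.
    upper = 1.0 if c == n else _root_of_increasing(
        lambda p: alpha / 2 - binom_cdf(c, n, p))
    return lower, upper
```

For instance, `clopper_pearson(3, 7)` returns the 95 per cent limits for three successes in ten trials; plotting such limits against c/(c+d) for fixed c+d traces out the belts.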


Obtaining Data on Consumption, Saving and Family Budgets in Underdeveloped Countries. E. Scott Maynes, 
University of Minnesota. 


This paper constitutes a generalization of the author's experience in conducting a personal interview survey of 
saving in Old and New Delhi, India. 

Statistical consultants in underdeveloped countries must operate in unfamiliar cultural environments. Facets 
of the Indian cultural environment, affecting the conduct of surveys but not generally covered in guidebooks or 
official sources, are discussed. These include such factors as non-communication between theoretical and applied 
statisticians, the problem of knowing when “agreements” have been reached, hiring practices and job tenure, caste 
and supervision. 

Response and nonresponse errors in surveys are attributed to three possible sources: failure to motivate the 
respondent, failures in communication between interviewer and respondent, and non-accessibility of information 
to the respondent. Techniques to minimise errors from each source are discussed. The chief source of errors in 
Indian financial surveys to date appears to be failures in communication, as contrasted to the U. S., where failures 
to motivate the respondent are most important. If true, this is fortunate since failures in communication are more 
easily remedied than failures in motivation. 

Though validation of estimates of saving for individual households is impossible, partial evidence indicates 
that techniques borrowed from American financial surveys “worked,” yielding valid saving data in an urban Indian 
setting. 


Prisoner Statistics—National and State. James A. McCafferty, U.S. Bureau of Prisons. 


The National Prisoner Statistics series was begun in 1926 by the U.S. Bureau of the Census. In 1950 the responsibility 
for the series was transferred to the U.S. Bureau of Prisons. This program collects on an annual basis 
data on prisoners received into and discharged by State and Federal correctional institutes for adult offenders. In 
addition the series provides the only national data on persons employed in such institutions, and on executions 
carried out by civil authorities in the United States. 

Since 1950 the Bureau of Prisons has (1) obtained the cooperation of all the States in reporting prisoner and 
related data; (2) caught up the processing on a tremendous backlog of prisoner data; (3) pioneered with eight States 
in the use, by the Bureau of Prisons, of punch cards containing data on adult prisoners and anticipates in short time 
six more States joining this State-Federal cooperative program and (4) partially brought up to date the published 
series through the release of pamphlet bulletins on prisoner population, prison employees, and executions. In process 
are detailed reports covering court commitments (years 1956-1957) and discharges (years 1954, 1955 and 1956). 

This series shows that in the last two decades, while the number of prisoners received from court and confined 
in State and Federal institutions has increased, the proportion of these individuals to the civilian population was 
lower in 1959 than in 1940:

                                      1940       1949       1959
Prisoners present December 31      173,706    163,749    207,513
    *Rate                            131.9      111.0      119.0
Prisoners received from court       73,104     68,925     87,586
    *Rate                             55.5       46.7       50.2


* Rate per 100,000 of the estimated civilian population. 
Source: National Prisoner Statistics, No. 24, table 1, July 1960. 


A New Measure of Corporate Profits. Edmund A. Mennis, Wellington Management Company.


This paper first describes briefly the various measures of corporate profits currently available, indicating the 
limitations of their usefulness, especially for financial analysis and administration.

A sample is then presented of profit data for 110 large corporations in 24 industries, which account for some 
$10 billion of after-tax profits a year. Sales, pretax income, net income, dividends, depreciation and capital expendi- 
tures for the years 1947-59 have been gathered, using data as reported to shareholders. The composition of the 
sample and a brief description of its historical record are presented. To the extent possible, the sample is reclassified
into minor industrial groups and compared with similar groups in the 1957 Statistics of Income. 

As an illustration of the use of the sample for current analysis, the expectations of profits by industry for 1960
are given, as estimated in June by the Wellington research staff. Estimates were prepared using uniform business 
assumptions and after analysis and field interviews. The industry expectations are compared with actual results for 
1959 and include sales, pretax and net income indexes, ratios of pretax and net income to sales, dividend payouts, 
increases in dividend payments, capital expenditures and coverage of such expenditures by internally generated 
funds. 

It is believed that this sample, covering a broad list of large corporations, presents more information for more 
recent periods in a manner more useful for financial analysis than other data currently available. 


Uses of Quantity Indexes and the Measurement of Real Value. Milton Moss, Board of Governors of the Federal
Reserve System.


There has just been published this summer a sizeable book on the latest revision of the Federal Reserve index 
of industrial production. The gains and limitations emerging from this undertaking are put before the public in 
150 pages of tables and 80 pages of text including 32 charts. That book is the place to go for a discussion of the 
details of compiling and using the country’s official monthly measure of real output in factories, mines, and electric 
and gas utilities. 

In this paper I propose to discuss certain measurement problems in the light of examples of the customary 
uses of production indexes. The two such uses which are singled out for this discussion are: (1) determining whether
the national economy is heading toward prosperity or recession; and (2) measuring economic growth and overall 
efficiency of the economy. 

The underlying theme of this paper is that many conceptual problems can be seen in more balanced perspective 
if these uses are kept in the forefront, and certain esoteric index number questions in the background. Indexes of 
quantity or of “real value” are not ends in themselves which need to be perfected for their own sake—but are a 
means to the understanding of the way our economy works. 


Computers and Statistical Analysis. J. A. Navarro, International Business Machines Corp. 


The use of computers for statistical analysis will be discussed with some emphasis placed on the computer as 
a tool for teaching courses in Statistics. 

The computer plays a role both in the processing of data and in the decision-making process. General information
on the time and cost of processing data will be given, with examples. The role of the computer in the decision-making
process will be discussed, including its limitations, possible applications such as reservation systems and automation
in banking, and general techniques for the solution of problems, such as simulation, Monte Carlo and business game
techniques. The role of the computer manufacturer as researcher, salesman, applications engineer, and communicator
of knowledge will also be discussed. A final topic will include some possible applications of the computer in
the area of teaching of statistics in the “classroom.”


The Flight to the Suburbs Slackens. A. F. Parrott, Consolidated Edison Co. of New York.


The flight to the suburbs has been the great mass migration of the postwar period. Though eight of the ten 
largest cities lost population in the last decade, there are many signs that in New York the tide has already turned. 

The suburbs around New York are built up solidly so far out that it is logical for the turn to first become 
evident there. With taxes soaring in the suburbs, commutation deteriorating and becoming more expensive and 
traffic congestion increasing, the suburbanite now spends more and more time and money to attain less and less of 
the amenities for which he left the city. 

This trend has become apparent only in the past three years. The most convincing evidence is found in the 
number of school budgets voted down on Long Island, the sharp decline in net out-migration of school children 
from New York City, the decline in requests for VA appraisals and the sharp rise in dwelling unit starts in the city. 

Elsewhere the turn should first become evident in such large or geographically restricted cities as Chicago, 
Philadelphia and San Francisco. Smaller cities, where it is still possible to reach open country easily, may continue 
to lose population for many years. 


Checking the Validity of the Toll Collectors’ Vehicle Classifications. Kenneth C. Pearson, Massachusetts Port
Authority. 


A question which often arises at toll facilities that offer reduced rates to regular users is, “How do you know 
that some of the toll collectors are not registering non-commuter vehicles at the reduced commuter rate and then 
pocketing the difference in the toll charged?” The answer usually given is that auditing and visual inspection are 
used to guard against this possibility. The purpose of this paper is to illustrate a statistical method of analysis 
which overcomes the inability of either of these methods to satisfactorily answer the basic question. 

The technique is based on matching the per cent of commuters value for each toll lane against the value that
occurred in the respective lane last week. The difference is then noted and is ordered. Using Σ Ranks/N as a
standard, each toll collector’s ranks are compared with this value. The number of below-average ranks is noted in
a given sample and if any collector receives a critical value, a non-statistical investigation of his work is initiated.


Design and Control of Mass Calibration Series. H. S. Peiser, National Bureau of Standards.


Requirements for the design of good calibration series for single-pan damped balances are given. 

The conventional method of weighing, in which the difference in indications is taken as the mass difference 
between loads, is illustrated by three examples of weight series. The point is made that the chief weakness of these 
series is the difficulty of recognizing slow balance drifts. Three series are described which nearly eliminate the effect 
of slow drifts in the balance indications. Much simplification is achieved when unknown weights are compared 
with standards of the same denomination. Finally a method for tolerance testing is given. 


Population Studies in Health Research. G. St. J. Perrott, Conference on Community Population Laboratories.


An increasing number of ecologic studies relating to health and illness, and other inquiries into population 
behavior are being based on long-time projects involving human population groups. This field has been variously 
referred to as epidemiology, field research, and human medical ecology. The writer is engaged in a study of past 
and current investigations in this field with the idea that useful guide lines may be established for those interested 
in research on the health of groups of human beings. The study is sponsored by the Study Section on Human 
Ecology of the National Institutes of Health, Public Health Service. This paper touches briefly on some of the 
classic contributions to the etiology of disease arrived at through population studies and to more recent investiga- 
tions seeking etiological clues, such as the association between lung cancer and smoking and air pollution; and the 
relation of such factors as diet and physical exercise to the incidence of coronary artery disease. The paper then 
describes the general types of problems toward which population studies have been directed, the various kinds of 
population groups selected as appropriate for these uses, and some of the methodological problems involved.


Computer Techniques in the Statistical Investigation of Non-Linear Parametric Relations. T. I. Peterson, Inter-
national Business Machines Corp.


A computer technique is described which provides for the automatic and systematic exploration of plausible 
functional relations representing experimental data. The functions considered are defined as solutions of systems 
of simultaneous first-order ordinary differential equations with unknown parameters. 

The computer technique uses: (1) a digital language derived from linear graph theory to generate functions; 
(2) a numerical integration method to solve differential equations; and (3) an iterative least squares algorithm for 
the non-linear estimation of associated parameters. 

Application of this technique is shown for a problem in chemical reaction kinetics. 


On Sampling With Varying Probabilities and Without Replacement. J. N. K. Rao, Iowa State University. 


It is well known that the variance of the estimate of the total in sampling with varying probabilities and
without replacement involves $\pi_i$, the probability of including the ith unit in a sample of size n, and $\pi_{ij}$, the
probability of including both units i and j in a sample of size n. It is universally accepted that by making $\pi_i$
proportional to the ‘size’ $x_i$ of the ith unit, considerable reduction in $\mathrm{Var}(\hat{Y})$ can be achieved. Hartley and Rao
(Abstract, A.S.A. Meetings, Washington, D. C., 1959) adopt a simple sampling procedure which is well known to survey
practitioners, for which $\pi_i \propto x_i$ exactly, and derive $\pi_{ij}$ explicitly in terms of the $\pi_i$ for moderate size
populations and n = 2. Also, compact expressions for $\mathrm{Var}(\hat{Y})$ are obtained to $O(N^{1})$ and $O(N^{0})$. An important
merit of this sampling procedure is that it permits ready evaluation of $\pi_{ij}$ and hence of $\mathrm{Var}(\hat{Y})$ for n > 2.
Here, expressions for $\pi_{ij}$, and hence $\mathrm{Var}(\hat{Y})$, are derived in terms of $\pi_i$ for n > 2, which presents certain
new features other than those encountered in the derivation of $\pi_{ij}$ for n = 2. It is interesting to note that most
of the published literature does not have anything to offer for n > 2, and deals exclusively with the case n = 2,
due to difficulties in evaluating $\pi_{ij}$.
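
For orientation, the role of the single and joint inclusion probabilities can be made explicit with the standard Horvitz-Thompson estimator of the total and the Sen-Yates-Grundy form of its variance for a fixed sample size n. These are textbook expressions supplied as background, not formulas quoted from the abstract. The symbols $y_i$ (value of the ith unit), $s$ (the sample) and $N$ (population size) are notation assumed for this sketch; $\pi_i$ is the probability of including unit i and $\pi_{ij}$ the probability of including both units i and j.

```latex
\hat{Y} = \sum_{i \in s} \frac{y_i}{\pi_i}, \qquad
\mathrm{Var}(\hat{Y}) = \sum_{i=1}^{N} \sum_{j>i}^{N}
  \left( \pi_i \pi_j - \pi_{ij} \right)
  \left( \frac{y_i}{\pi_i} - \frac{y_j}{\pi_j} \right)^{2}.
```

If $\pi_i$ were exactly proportional to $y_i$, every squared difference would vanish, which suggests why inclusion probabilities proportional to a size measure closely related to $y_i$ reduce the variance, and why the joint probabilities $\pi_{ij}$ must be known before the variance can be evaluated.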


Projecting Enrollments at the University of Toronto. G. de B. Robinson, University of Toronto.


The problem considered here is that of projecting university enrollment. Usually this is accomplished by com- 
paring an individual university with the local total population of a given age group—in this case the Province of 
Ontario. A more accurate estimate has been obtained by taking the populations in all the schools in the Province 
in the different grades, computing passing percentages between grades and extending these passing percentages into 
the University. The large immigration since the war has complicated this situation but it is believed that the method 
provides valuable data on which to base questions of University policy. 
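
The grade-to-grade approach described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's procedure: the function names, the four-level layout (three school grades followed by first-year university) and all enrollment figures are hypothetical, and passing percentages are assumed to stay constant from one year to the next.

```python
def passing_ratios(earlier, later):
    """Ratio of enrollment in level g+1 (later year) to level g (earlier year)."""
    return [later[g + 1] / earlier[g] for g in range(len(earlier) - 1)]


def project_next_year(current, ratios, new_entrants):
    """Advance every cohort one level; new_entrants fills the lowest level."""
    return [new_entrants] + [current[g] * ratios[g] for g in range(len(ratios))]


# Hypothetical enrollments by level: three school grades, then first-year university.
year_1 = [1000, 900, 800, 200]
year_2 = [1000, 950, 720, 240]

ratios = passing_ratios(year_1, year_2)
year_3 = project_next_year(year_2, ratios, new_entrants=1000)
```

The last entry of `year_3` is the projected university intake; repeating the projection step carries the school cohorts forward for as many years as there are grades already enrolled.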


Urban Concentration and Occupational Structure in the United States: 1900-1950. Georges Sabagh, Maurice D.
Van Arsdol, Jr., Hamid Zahedi, University of Southern California.


This paper is concerned with measures of urban population concentration with respect to social and economic 
changes associated with urbanization and population distribution. An index based on the Lorenz curve is used to 
describe urban population concentration by consolidated proportions of population and cities for the states of the 
United States for the years 1900-1950. A comparison made with other measures of urban population distribution 
indicated that the Lorenz curve index varies somewhat independently of the “per cent urban” measure used by 
the U. S. Bureau of the Census and is more closely related to an index devised by Kingsley Davis to summarize 
population distribution by community size. It was hypothesized that there would be a definite pattern in the rela- 
tionship between urban concentration and economic development. The initial transition from an agricultural to an 
industrial economy would be accompanied by rapid concentration, while the later stages in economic development 
would be characterized by a slowing down or even a reversal in concentration. This hypothesis is not easily tested, 
particularly since states constitute arbitrary regions and since it is increasingly difficult to isolate regional from 
national trends in a maturing economy. A comparison was made between concentration and occupational structure 
of states in 1880, 1900, 1940, and 1950. The analysis suggests the existence of a definite pattern in the relationship 
between these two variables. 

The findings indicate that research on urbanization needs to take into account more rigorous definitions of 
population distribution and to focus on population redistribution as an important variable in social change. 
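
An index based on the Lorenz curve can take several forms, and the abstract does not give the exact definition used. As a hedged illustration only, the sketch below computes the familiar concentration ratio (one minus twice the area under the Lorenz curve) for a list of city populations. The function name and the sample figures are hypothetical.

```python
def lorenz_index(populations):
    """Concentration ratio: 1 - 2 * (area under the Lorenz curve)."""
    xs = sorted(populations)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one city and a positive total population")
    cumulative = 0
    area = 0.0
    for x in xs:
        previous = cumulative
        cumulative += x
        # Trapezoid between successive points of the Lorenz curve.
        area += (previous + cumulative) / (2 * total * n)
    return 1 - 2 * area


# Hypothetical states: one with four equal cities, one with a single dominant city.
even_state = lorenz_index([5, 5, 5, 5])
concentrated_state = lorenz_index([0, 0, 0, 20])
```

The index is zero when all cities are the same size and approaches one as the population concentrates in a single city, so it measures something different from the share of population classed as urban.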


Opportunities for Statisticians With International Organizations. Charles F. Sarle, UN/FAO Regional Agricul-
tural Census and Sampling Adviser (on leave from University of Florida).


The recommendations of the eleventh session of the United Nations Statistical Commission (1960) and the ten
specialized agencies and nongovernmental international organizations participating are ample evidence of the need for
statistical information and for more reliable methods (census and sampling) for getting the basic data, particularly 
in underdeveloped countries. 

These agencies have a budget for statistical purposes. One of these agencies, the Food and Agriculture Organiza-
tion of the United Nations (FAO), is actively seeking statisticians for its Technical Assistance Programs in various
countries, and for two permanent posts, one at Bangkok and the other at Rome. 

This paper discusses the nature of the opportunities for statisticians indicated by the Commission's recom-
mendations and other needs recognized by the writer. The opportunity to provide statistical know-how and assist-
ance to an underdeveloped country gives personal gratification to anyone really interested in the socio-economic 
development of the free world. 





Analysis of Sociometric Structure: A Method of Successive Groupings Based Upon Distance Measures. Jack
Sawyer and Terrance A. Nosanchuk, University of Chicago.


By use of any of the standard judgment models of psychological scaling one may obtain an n×n symmetric
matrix of perceived distances—on the basis of some relevant attribute or behavior—between all possible pairs for 
n persons. 

Taking this matrix as given, this paper deals with the determination of groupings of points (persons) close 
to one another, reflecting the social structure of the entire set of n persons. (Note that this is a quite different 
question from the more usual one of dimensionality and projections of the set of points, and in the case of social 
structure, often a more appropriate one.)

The “method of successive groupings” here presented provides a completely determined, easily applied pro-
cedure for ascertaining the “optimal” group structure, for any number of groups 1, 2, ..., n−1, n, where n is
the number of points.

The method proceeds by regarding the n points as n 1-point groups, then forming the best set of n−1 groups
by combining those two “groups” whose interpoint distance is smallest. Next the best set of n−2 groups is formed,
and so on—at each stage, two of the previously existing groups being combined, until finally, the last two groups 
are combined to form a single group of n points. 

The criterion for combination is the minimization of the sum of the inter-point distances between
points within the same group. This sum is, of course, zero for the initial stage of n 1-point groups. Each combination
of groups then adds an increasing amount to the sum, and comparison of these increments furnishes a standard 
for determining the number of groups. Thus the method not only provides, for any given number of groups, the 
minimizing structure, but gives some guidance on the question of how many groups there should be. 

A simple analytical procedure has been devised which allows this method to be carried out with relative ease 
on a desk calculator, even for groups as large as fifty. 
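
The procedure can be sketched as a greedy merging loop. The following Python is an illustrative reading of the abstract, not the authors' desk-calculator routine: at each stage it combines the two groups whose merger adds the least to the within-group sum of distances, and it records each increment. The function name, the tie-breaking rule (first pair found) and the four sample points are assumptions.

```python
from itertools import combinations


def successive_groupings(dist):
    """Merge groups one pair at a time; dist is a symmetric n x n matrix (list of lists)."""
    groups = [[i] for i in range(len(dist))]
    history = []  # entries: (number of groups, increment to the sum, grouping)
    while len(groups) > 1:
        best = None
        for a, b in combinations(range(len(groups)), 2):
            # Merging two groups adds every distance between their members.
            increment = sum(dist[i][j] for i in groups[a] for j in groups[b])
            if best is None or increment < best[0]:
                best = (increment, a, b)
        increment, a, b = best
        merged = sorted(groups[a] + groups[b])
        groups = [g for k, g in enumerate(groups) if k not in (a, b)] + [merged]
        history.append((len(groups), increment, [list(g) for g in groups]))
    return history


# Hypothetical example: four persons placed at positions 0, 1, 10 and 11 on a line.
positions = [0, 1, 10, 11]
dist = [[abs(p - q) for q in positions] for p in positions]
history = successive_groupings(dist)
increments = [step[1] for step in history]
```

In this example the increments are small for the first two mergers and jump sharply at the last one, which is the kind of comparison the abstract proposes as guidance on how many groups to retain (here, two).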


Training the Practicing Statistician in the Use of Electronic Computers. M. H. Schwartz, Federal Reserve Board.


We conduct four levels of training in electronic computing for Federal Reserve research persons. 

(1) For staff personnel who are likely to program recurrent or large scale projects: a full course in all the details 
of data preparation, programming, program testing, and machine organization for punched card and magnetic
tape operations. Illustrative programs cover a wide range of statistical operations. 

(2) For higher level staff who wish to write their own research or study programs: 6 to 12 hours in the use of an 
interpretive automatic programming technique. Emphasis is on problem solving and a minimum of attention is 
paid to the machine. 

(3) For staff personnel interested in understanding the equipment and its programming, but who do not 
necessarily intend to program: a 19-hour course in programming and machine organization. 

(4) For senior research administrators: a one-week full-time course covering training, planning, programming, 
testing, machine operating, alternative equipments, and computer potentials. Objective is preparation for broad 
supervision. 

For top level administrators at the policy-making level we conduct a series of three 1-hour orientation lectures 
about the equipment and its development, the nature of programming, and the potentials for economic and statis- 
tical research and for Federal Reserve operations generally. 


Medical Care For the Aging: Santa Cruz County Study. Eleanor Suuirer, Dept. of Public Health, County of Santa
Cruz. 


Santa Cruz County, in addition to and because of its other charms, provides an excellent fund of information 
on medical care for the aging. The County is known as a retirement area, and has some 14 per cent of its population 
aged 65 and over, compared with some 8 per cent for the nation as a whole. One-third of the people in this bracket
are on old age assistance, known in California as Old Age Security, and are receiving a wide range of medical services
with but little limitation on some of them, and with complete records. The County has an active local health pro-
gram, including the first geriatric clinic for preventive diagnostic screening in the country. Several senior citizen
social groups in the town guarantee that information (and misinformation) spreads rapidly by word of mouth.

The study will cover the medical care of all OAS recipients over a period of up to seven years, cutting off at 
12/31/61. Unit and dollar figures will be collected on utilization, variation in utilization, and estimates of actual 
“need” as contrasted to utilization. Relationships with age, sex, duration on OAS, and some sociological factors 
will be explored. As a by-product, we will have some figures that will permit calculation of income elasticity of 
demand for certain goods and services, non-medical as well as medical, under the specified conditions. 

With respect to particular ailments, we will have figures on incidence, prevalence, mortality, pattern and
quantity of services utilized, etc. Of course, for many ailments the numbers involved will be too small to yield
significant statistics; but they will be available for pooling with data from other sources. 

We may also expect statistics that will aid in the medical post-audit and policing of practitioners. For instance,
we will have, by doctor, such figures as frequency of various diagnoses, per cent of new patients diagnosed to have
ailments that require further medical expense, ratio of prescriptions to visits. We will have hospital costs paid by 
private insurance companies compared with premium payments received and balance left to be paid by other means. 

While our records will certainly include the bulk of the medical care received by this population, we appreciate
the fact that there are other sources of medical care, whose extent we will try to catch by personal interview but 
which may well remain understated. These are the medical care provided the recipient through his own funds or 
funds of friends and kin or through the generosity of the private practitioners. 


The Concept of “Usual Residence” in the Census of Population. Henry S. Shryock, Jr., Bureau of the Census.


Where people are counted is important not only for the presentation of statistics on various subjects but also 
for the apportionment of representatives in Congress and State legislatures and for other legal purposes. The use 
of the concept of “usual residence” is traced through the history of the Census of Population of the United States 
and is compared with other concepts such as “de jure” and “de facto” population. “Usual residence” as defined in 
the Census of Population does not necessarily connote permanent residence, and the official definition may differ 
from the interpretation given by the general public. The treatment of certain problem groups is discussed, for exam- 
ple: (1) members of the armed forces, (2) college students, (3) inmates of institutions, (4) persons with more than 
one residence, and (5) persons with no usual place of residence. Finally, recent expansions in the program for the 
enumeration of Americans abroad are described. 


A Critique of Inventory Forecasting Techniques. T. M. Stanback, Jr., New York University.


This paper seeks to establish criteria for assessing the validity of certain approaches to short term forecasting 
of aggregate inventory investment. In addition, measures are made of forecasting accuracy of the several approaches 
during recent years. In developing criteria, reference is made to evidence of the nature of inventory behavior re-
vealed by recent empirical work of Darling, Lovell, Terleckyj and of the author as well as to earlier work of Abramo-
vitz and others. The roles of a number of variables including desired inventory to sales ratios, price changes, orders
and unfilled order backlogs are examined. Forecasting techniques which are tested are those of Terleckyj (NICB),
Fortune, and Dun & Bradstreet. 


How Should Statistical Methods be Adapted for Use in Auditing and Control? Frederick F. Stephan, Princeton
University.


When statistical methods are introduced in a new field of application, they pass through three stages of adapta- 
tion. First, methods found useful in other fields are applied to the new material with only superficial changes. 
Second, new devices and procedures are invented to make the imported methods more suitable under the new 
conditions. Third, the modified methods are appraised, in the light of accumulating experience, for their effectiveness 
for particular kinds of problems. 

To remove obstacles to this development we need to discover, through cooperation between accountants and 
statisticians, the problems that must be overcome. Some modifications of terminology and procedure will be re- 
quired. Methods commonly used with isolated sets of data must be fitted to the systematic relationships between 
accounts and their supporting documents. The typical modes of thinking about statistical problems must be related 
effectively to the modes of thinking of accountants, auditors, and controllers. Statisticians have much to offer and
should prepare it well for export. Both professions will gain greatly from effective cooperation. 


Decision Rules in Reliability Testing. David S. Stoller, The RAND Corp.


There are many kinds of equipments which cannot be tested for their reliability performance without being 
partially or completely destroyed or expended in the process, for example: resistors, signal flares, and satellites. 
When an item is relatively cheap, like resistors, a great deal of information on the reliability of the item can be 
obtained at a very small cost. When the item is relatively expensive, like satellites, the cost of small amounts of 
information on reliability is considerable. This paper develops some decision rules appropriate to the problem of 
choosing sample sizes for non-recoverable reliability proof testing of relatively expensive items. 

The decision rules are based on the nature of utility functions, U(n, R), where n is the number of items expended
in reliability proof testing, and R is the reliability parameter; and on the behavior of information functions, $U_{\epsilon}(n, R)$,
where $\epsilon$ is the probability that the information does not exceed a specified amount. Several kinds of decision rules
are discussed, including Bayesian, min-max, and nearly min-max. Some practical rules-of-thumb are given. 


Long-Term Prospects for Industrial Production. J. C. Swantiey, American Telephone and Telegraph Company. 


The prediction of future growth of the United States is an expression of faith in the continuation of the American 
economic tradition. This encompasses maintenance of the private enterprise system where the market place will be 
allowed to specify the kinds and quantities of products to be made, and preservation of an economic climate in 
which savings, investment and innovation will be encouraged. 

In the next decade, the population in the dependent age groups will continue to increase relative to the popula- 
tion of working ages. Moreover, over half of the total increase in the population of working ages will be less than 25 
years of age. As a result, we may be faced with an over-supply of less skilled workers coincidental with a shortage 
of trained workers. Educational facilities will need to be expanded and substantial increases in capital outlays will 
be required to maintain the postwar rate of increase in output. 

These considerations suggest that increases in total national output and industrial production are more likely 
to be closer to the postwar average than to some arbitrarily selected higher rate that may be considered to be 
desirable. 


A Computer Application Providing a Basis for Forecasting With Business Time Series. Z. Z. Szatrowski, Inter-
national Business Machines Corp.


This paper describes a time series analysis yielding data suitable for 1) forecasting with error estimates, 2) meas- 
uring relationships distinguishing between trends, cycles, etc., 3) analysis of component variance with degrees of 
freedom estimates, 4) “power spectrum” analysis for non-stationary time series. The procedure regards each run,
suitably defined (usually low to high and high to low), as having one degree of freedom and being associated with
some cause. Averages of runs are, in turn, examined for runs, and this repeated averaging and run analysis yields
degrees of freedom and values measuring successively longer cycles and eventually trend. The distribution of periods
and amplitudes yields a kind of power spectrum analysis. The above information provides a basis for analysis of
variance and forecasting. The procedure appears to be more “efficient” than the National Bureau of Economic Re- 
search approach. The analysis, although simple conceptually, does try to utilize probability concepts and is a logical 
computer application. 


On Problems of Measuring the Distribution of Population in an Urban Area. Martin Tarrex, University of Chicago. 


The orientation for this paper is that of a statistician giving advice and counsel to experts in sociology, economics 
and planning. Assumptions by other investigators as to the centers of urban areas rather than the measurements of 
such centers is shown to place serious limitations upon the validity of their results. Shortcomings in the methods 
of measuring parameters by using the regression of the logarithm of density on distance from center are noted. On 
the positive side, measurements of parameters (including those for the center) under the assumption of random 
sampling from a circular city are developed; a suggestion for weighting in the case of the regression technique is
made; the interpretation of “size of city” parameter in the regression technique is discussed; and the need for 
applying Chi-square goodness of fit tests (or better tests) is emphasized. 


The Common Market in Europe and its Marketing Implications to the United States. Klaus von Dohnanyi,
Infratest, Munich, Germany. 


Under the Common Market Treaty a new economic unit is developing on the European continent. With tariff 
barriers and import restrictions being slowly eliminated, and major economic policies becoming increasingly coor- 
dinated, six national markets will become more and more integrated in the 1970's. 

This process of integration, however, primarily involves aspects which can be affected by legislation. In viewing 
the Common Market future, one should not forget that the six countries to be combined not only differ widely 
with regard to their basic economic structure, but that they are today also in very different stages of economic 
growth. Their future economic development and expansion potential will be anything but similar. Mature markets
such as Switzerland and Belgium, for example, with high per capita income will become part of the same unit as 
Italy. 

Even more important perhaps than these differences in economic growth are the general sociological and ethno-
psychological differences. Nations with a completely different history, different languages and different ways of life
will begin to administer their mutual economic problems together, but the individual problems of the countries 
will remain of major importance for a long time to come. 

American business now planning for marketing in Europe in the 1960's must realize that the Common Market 
Treaty has, of course, important implications with regard to the selection of production or assembly locations, but 
marketing for the consumer will continue to be a matter very closely adapted to the requirements of the individual 
area concerned. Thus, marketing planning will have to use data and statistics which are in most cases available 
only on a national level and which are incomparable in many instances because of their composition.

To create a basic pattern of statistical information indispensable for modern business planning for the entire 
Common Market area will be a tough but rewarding job of the future. 


The U. S. Office of Education Statistical Program. Virgil R. Walker, U. S. Office of Education.


For almost a century the Office of Education has been reporting on the “condition and progress of education” 
in the United States and is the focal point in the Federal Government for such data. It has developed several series 
of publications which furnish multipurpose statistics in the field of education. For years the best known of these was 
the Biennial Survey of Education. More recently, others, such as those dealing with degrees or fall enrollment, 
have also become very widely used. These series, combined with many special purpose studies, have provided many 
data in the form of basic statistics concerning education from the kindergarten through the graduate level. 

The Office has developed a very high level of rapport with respondents and excellent response rates in a purely 
voluntary reporting structure. Constant attention has been given to maintaining high levels of precision, reliability, 
and validity; increased use of sampling; close working relationships between specialists in education and statistical 
personnel; a growing rigor in the application of statistical standards; and the extension of sound research design 
and principles in depth studies and in projections and estimates. 

Several major problems remain to be solved to insure continued effective conduct of the statistical mission 
of the Office. If we are not fully aware of the current status of education, what lies ahead of us, and what can be 
done to identify and attack current and emerging problems, a national crisis in education may come upon us for 
which we will be totally unprepared. 

A coordinated program must be developed to fill major gaps that have been identified. Uniformly high 
standards of quality need to be applied to all statistical reports of the Office. More analytical and interpretive studies 
should be developed. Data must be made available in a form most appropriate to users and with less delay than at 
present. 

Many of the major problems are now being attacked vigorously along well-defined lines. Program development 
is being systematized and coordinated. Improved information collection and data processing systems are being 
designed which will speed the production of statistical data without reducing quality and permit collection of new 
types of data. Qualified staff are being assembled and assigned to the development and application of statistical 
standards throughout the Office. Research advisory service functions are being strengthened. Interpretive studies 
are increasing in number. Greater attention is being given to adapting the program to the emerging needs of users 
and to improving dissemination. 


Moving Amplitude Adjustments. E. Craig West, Dominion Bureau of Statistics. 


A process of trial and error led to the adoption of the moving seasonal amplitude method of seasonal adjustment 
to adjust the Canadian unemployment series at the time of the 1958 recession. This method, originated by 
Simon Kuznets in the 1930's, is proposed in this paper as having a better theoretical and statistical basis than 
the conventional ratio-to-moving-average, moving seasonal approach that is the basis of the Bureau of the Census’ 
Method II. The use of very flexible curves (usually a three-term of a three-term moving average) under the latter 
approach is difficult to defend on the theoretical grounds of defining slow progressive institutional changes in 
seasonals. On the other hand, the amplitude technique of measuring seasonal change is an improvement not only 
in defining but particularly in measuring seasonal change. 

While the Census Method may be considered the established general approach, the amplitude method should 
not be overlooked, especially in specific trouble areas. Illustrations of its superiority are given for unemployment, 
a component of corporation profits, housing starts and an export series. In each case the interdependence of the 
cyclical and seasonal elements of the time series is measured and used to give a superior seasonal adjustment, 
particularly in the current period. 


Consumer Perception—A Theoretical Guide for New Techniques in Evaluating Advertising Effectiveness. Irving S. 
White, Creative Research Associates. 


Advertisers’ wish for a simple predictive tool for assessing advertising effectiveness, one which eliminates 
consumer perceptions, is highly unrealistic. Consumer perceptions of advertising are crucial indicators of successful 
communication. Two areas of perception—cognition and value—are basic for predicting consumer behavior: 

(1) the product redefinition occasioned by the advertisement 

(2) the product value communicated by the advertisement. 

One research design which allows insight into product redefinition is the use of a “consumer involvement” 
categorization. “High involvement” consumers, whose needs are high in relation to a product category, magnify 
the new meanings contributed by an ad to a product. A design allowing for insight into the new values contributed 
by the advertisement to the product involves the use of a “loyalty” categorization. “High loyal” consumers magnify 
the source of values in any product. Thus, advertising evaluation must utilize insights into product meanings and 
values to make longer range sales predictions. 


A Computer Program for the Complete Analysis of Time Series. Robert M. Williams, University of California. 


This paper describes the development and operation of an IBM 709 computer program for the complete 
analysis of monthly time series data. The total program is composed of several sub-routines which can be used 
separately or in one continuous operation. 

The sub-routines perform the following operations: 

(1) Deflation of the time series, if necessary, for price changes. This requires prior selection of the proper price 
deflator. 

(2) Fitting each of several trend types to the data, including straight line, exponential, modified exponential, 
second degree, and Gompertz. Automatic selection of the trend of “best fit” by the least-squares criterion. 

(3) Computation of moving seasonal indexes by the ratio-to-moving-average method. In this computation, 
the trend types listed in (2) are fitted to the ratios for each individual month, and the trend of “best fit” is selected. 

(4) Trend and seasonal components are eliminated from the original or price deflated data. Irregular fluctua- 
tions then are eliminated by a weighted moving average, leaving the cyclical component. 

(5) Finally, the irregular component is isolated, and the residuals are tested for the existence of autocorrelation. 
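
Step (3) above can be illustrated with a minimal Python sketch of the classical ratio-to-moving-average computation. This is not the IBM 709 program described in the abstract: the function names are hypothetical, and for brevity the sketch averages the ratios for each month into fixed seasonal indexes instead of fitting a trend to them to obtain moving indexes.

```python
def centered_moving_average(series, period=12):
    """Centered 2 x `period` moving average; None where it is undefined."""
    half = period // 2
    out = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half:t + half + 1]
        # The two end points get half weight so the average is centered on t.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        out[t] = total / period
    return out


def seasonal_indexes(series, period=12):
    """Mean ratio-to-moving-average for each month, scaled to average 100."""
    trend_cycle = centered_moving_average(series, period)
    ratios = [[] for _ in range(period)]
    for t, (y, ma) in enumerate(zip(series, trend_cycle)):
        if ma is not None:
            ratios[t % period].append(y / ma)
    means = [sum(r) / len(r) for r in ratios]
    scale = period / sum(means)  # force the indexes to average 100
    return [100.0 * m * scale for m in means]


# Illustrative data only: a constant level of 100 times a repeating pattern.
pattern = [0.8, 0.9, 1.0, 1.1, 1.2, 1.0, 0.9, 0.8, 1.0, 1.1, 1.2, 1.0]
series = [100.0 * pattern[t % 12] for t in range(48)]
indexes = seasonal_indexes(series)
```

Dividing the original series by these indexes (over 100) would give the seasonally adjusted series from which step (4) proceeds.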


The Experimental Study of Style. Walter A. Woods, Nowland and Company, Inc. 


It is suggested that experimental research has suffered from an over-emphasis on survey research in the study 
of style among marketing and consumer research practitioners. 

Broader experimentation is required. In the study of style, it must be recognized that all perceptual modalities— 
gustatory, olfactory—as well as visual and auditory are subject to stylistic responses. Further, “life-style” overrides 
style considerations for any modality and must be considered in the prediction of style acceptance for any single 
modality. 

Five variables which are relevant to the study of style are discussed: (1) behavior responses which are genetic 
in origin, in contrast to those resulting from external sources; (2) behavior responses which are determined by the 
developmental level of the perceiver; (3) behavior responses which emanate from the problem solving (a cognitive) 
approach of the perceiver; (4) the influence of “leader” groups; (5) response differences resulting from the psycho- 
logical states of the organism—satiation and adaptation. 

The prediction of style is contingent on t of resp of such controlled variables as these. 





The Robustness of Life Testing Procedures Derived from the Exponential Distribution. M. Zelen and Mary C. 
Dannemiller, National Bureau of Standards. 


Almost all the statistical procedures in current use for evaluating the reliability of components or equipment 
rest on the assumption that the failure times follow the exponential distribution. However, in practical situations 
one rarely has enough data to determine whether failure times are actually exponential. This paper studies the 
behavior of several statistical life testing procedures based on the exponential failure law if the true failure law is 
the Weibull distribution. It is found that these statistical techniques, which are widely used, are very sensitive to 
departures from initial assumptions. Applying these techniques to life test data when the exponential failure law 
is not satisfied may result in substantially increasing the probability of accepting components or equipments having 
poor mean-time-to-failure. 
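
The effect described can be sketched numerically in Python. The acceptance plan below (accept a lot if none of n items fails before a fixed test time) is a hypothetical one chosen for simplicity, not one of the procedures studied in the paper, and the function names are assumptions. The Weibull law is parameterized by its mean and shape, so that shape 1 reproduces the exponential law with the same mean.

```python
import math


def weibull_scale_for_mean(mean, shape):
    """Scale parameter giving a Weibull distribution the requested mean."""
    return mean / math.gamma(1.0 + 1.0 / shape)


def survival(t, mean, shape):
    """P(T > t) for a Weibull law; shape = 1 is the exponential law."""
    scale = weibull_scale_for_mean(mean, shape)
    return math.exp(-((t / scale) ** shape))


def acceptance_probability(n_items, test_time, mean, shape):
    """Chance that none of n_items fails before test_time (hypothetical plan)."""
    return survival(test_time, mean, shape) ** n_items


# Illustrative numbers: ten items tested for 0.2 time units, true mean life 1.0.
p_exponential = acceptance_probability(10, 0.2, mean=1.0, shape=1.0)
p_weibull = acceptance_probability(10, 0.2, mean=1.0, shape=2.0)
```

With these illustrative numbers the plan accepts far more often under the Weibull law with shape 2 than the exponential calculation suggests, although the mean life is the same in both cases.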


The Theory of Storage. P. A. P. Moran. London: Methuen and Co., 1959. Pp. 111. 21s. 


Herbert Scarf, Stanford University 


One of the fields of operations research which has been studied less extensively in 
this country than elsewhere is the analysis of the stochastic processes arising in 
water storage systems. Professor Moran, who has been quite influential in the de- 
velopment of this research, has written an excellent summary. Apparently the pub- 
lication of the volume was held up for a considerable length of time and some of the 
more recent work on the theory of dams has not been included. 

The following example is typical of the type of situation which is analyzed. We 
examine a water storage system (say a dam or reservoir) whose supply of water is 
replenished in a random fashion over time. The system is assumed to have a finite 
capacity so that any excess of water over that capacity is lost if not used immediately. 
Decisions are made as to the release of water at various moments of time, in order 
perhaps to provide hydroelectric power, or for the purposes of irrigation. 

Any conceivable set of release rules will give rise to a stochastic process and it is 
therefore, in principle at least, possible to obtain means and distributions for quanti- 
ties of interest. In practice, however, these analytic calculations are extremely for- 
midable unless very restrictive assumptions are made as to the nature of the distribu- 
tion of incoming rainfall, and very simple release rules are utilized. 
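
A minimal Python sketch of such a system follows, assuming a discrete-time dam with a fixed capacity and the simplest release rule, a constant demand per period met as far as the contents allow. The function name and the order of events within a period (inflow, overflow, then release) are assumptions made for illustration and are not taken from the monograph.

```python
def simulate_dam(inflows, capacity, release, initial=0.0):
    """Track contents, overflow and unmet demand of a finite dam.

    Each period: add the inflow, spill anything above `capacity`,
    then release up to `release` units (less if the dam runs dry).
    """
    content = initial
    contents, overflow, shortfall = [], [], []
    for x in inflows:
        filled = content + x
        spill = max(0.0, filled - capacity)
        filled -= spill
        drawn = min(release, filled)
        content = filled - drawn
        contents.append(content)
        overflow.append(spill)
        shortfall.append(release - drawn)
    return contents, overflow, shortfall


# Illustrative inflows only; in practice these would be random draws.
contents, overflow, shortfall = simulate_dam([3, 0, 5, 1], capacity=4, release=2)
```

Feeding the function random inflows and averaging over many runs gives simulated estimates of the means and distributions that are hard to obtain analytically.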

Professor Moran remarks on the similarity between the theory of dams and the 
analysis of inventory policies, both being concerned with storage systems. In fact, 
Chapter II of this monograph consists of an analysis of several simple inventory 
models. While the mathematical techniques used in these two fields do occasionally 
have points of similarity, there is at least one important difference which should be 
kept in mind. The practical applications of inventory techniques at present are in the 
control of large numbers of inexpensive, common items. For such items, approximate 
analytic solutions to relatively simple models are in order. The costs of a more elab- 
orate study would not in most cases be warranted by the modest improvement in 
inventory policies. In the construction of a dam or reservoir, the situation is clearly 
quite different, and the value of an analytic approach based on somewhat restrictive 
assumptions would seem to be more questionable. In this respect it is unfortunate 
that so little space is devoted to the considerably more flexible techniques by means 
of which optimal release rules could be calculated. 


Dynamic Programming and Markov Processes. Ronald A. Howard. New York: John 
Wiley and Sons, Inc., 1960. Pp. viii, 136. $5.75. 


Marshall Freimer, Lincoln Laboratory, M.I.T. 


The problems dealt with in this book differ from the usual Markov process problems 
in two respects. First of all, a monetary reward accompanies each state transition. 
And secondly, the transition probabilities and rewards are subject to choice. The 
problem is to choose these so as to maximize the expected reward. 

Howard considers both discrete and continuous time processes, with and without 
discounting of future rewards. For short range processes, the usual value-iteration 
method of dynamic programming is satisfactory. For processes extending infinitely 
far into the future, Howard develops a policy-iteration method. This makes use of an 
iteration cycle consisting of a value-determination operation and a policy-improve- 
ment routine. Although the specific forms of these parts change for the various proc- 
esses considered, all seem well suited for use on a digital computer. 
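
The iteration cycle can be sketched in Python for the discrete-time case with discounting. This is a minimal illustration under stated assumptions, not Howard's own program: the function names, the data layout, and the two-state numbers are hypothetical. Value determination solves the linear equations for the present values under the current policy; policy improvement then picks, in each state, the alternative with the largest expected immediate reward plus discounted future value.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def policy_iteration(P, q, beta):
    """P[s][k] is the transition row and q[s][k] the expected one-step
    reward when alternative k is used in state s; beta is the discount."""
    n = len(P)
    policy = [0] * n
    while True:
        # Value determination: solve (I - beta * P_policy) v = q_policy.
        a = [[(1.0 if i == j else 0.0) - beta * P[i][policy[i]][j]
              for j in range(n)] for i in range(n)]
        b = [q[i][policy[i]] for i in range(n)]
        v = solve_linear(a, b)
        # Policy improvement: choose the best alternative in each state.
        new_policy = []
        for i in range(n):
            def test_quantity(k):
                return q[i][k] + beta * sum(p * vj for p, vj in zip(P[i][k], v))
            best = max(range(len(P[i])), key=test_quantity)
            # Keep the current choice on ties so the loop cannot cycle.
            if test_quantity(policy[i]) >= test_quantity(best) - 1e-12:
                best = policy[i]
            new_policy.append(best)
        if new_policy == policy:
            return policy, v
        policy = new_policy


# Illustrative two-state problem with two alternatives per state.
P = [[[0.5, 0.5], [0.8, 0.2]],
     [[0.4, 0.6], [0.7, 0.3]]]
q = [[6.0, 4.0], [-3.0, -5.0]]
policy, values = policy_iteration(P, q, beta=0.9)
```

For these numbers the iteration settles on the second alternative in both states after one improvement step, which suggests why the method suits a digital computer: each cycle is one linear solve followed by a finite comparison.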

In analyzing the problems Howard makes use of generating functions (which he 
calls z-transforms) for the discrete time processes, and Laplace transforms for the 
continuous cases. Certain difficult questions which arise in applying these techniques 
are dealt with by generalizing from examples (see pp. 9-12). It would seem appropri- 
ate to give references to the proofs of these points, but Howard does not. 

The material in this book is presented clearly, with many carefully worked out 
examples. It should be accessible to any graduate student in engineering or a physical 
science. 


An Introduction to Mathematical Statistics. H. D. Brunk. New York: Ginn and Company, 
1960. Pp. x, 403. $7.00. 


George E. Nicholson, Jr., University of North Carolina 


This text is intended to serve as an introduction to probability and the fundamental 
concepts of mathematical statistics for students who have studied the 
calculus. It is written to provide material for a one semester, three-hour course. 
Sufficient optional material is included in the book, however, to make it suitable for 
courses for which more time is available. 

The book is divided into two parts. Part one, Introduction to Probability, is cov- 
ered in the first 92 pages. Part two, based upon all of the material in part one, Intro- 
duction to Statistics, is covered in the remaining 268 pages of text. The book contains 
33 pages of tables. 

The treatment of the topics covered in the book is in the modern spirit of the axio- 
matic development of probability theory and the development of the principles of 
mathematical statistics from the point of view of hypothesis testing and estimation. 

Probability is covered in five chapters as follows: Chapter 1, Elementary Probabil- 
ity Spaces; Chapter 2, General Probability Spaces; Chapter 3, Random Variables; 
Chapter 4, Combined Random Variables; Chapter 5, The Algebra of Expectations. 
These chapters are carefully written and are accompanied by good illustrative ex- 
amples and sets of problems. 

Statistics is covered in eight basic chapters: Chapter 6, Random Sampling; 
Chapter 7, Law of Large Numbers; Chapter 8, Estimation of Parameters; Chapter 9, 
Central Limit Theorem; Chapter 10, Confidence Intervals and Tests of Hypotheses; 
Chapter 12, Regression; Chapter 13, Sampling from a Normal Population; Chapter 
14, Testing Hypotheses. Optional topics included in this section are covered in 
Chapter 11, Statistical Decision Theory; Chapter 15, Experimental Design and 
Analysis of Variance; Chapter 16, Other Sampling Methods; and Chapter 17, 
Distribution Free Methods. 

Thirteen tables are contained in the book. These are Binomial Probability func- 
tion, Summed Binomial Probability function, Poisson Probability function, Normal 
Distribution, Chi Square Distribution, “Student’s” Distribution, F Distribution, 
Values of $e^{-x}$, Natural Logarithms of Numbers, Common Logarithms of Numbers, 
Random Numbers, Kolmogorov’s Statistic. 

In this reviewer’s opinion the book is a welcome addition to the texts already avail- 
able for use in an introductory course in mathematical statistics. All of the funda- 
mental ideas which ought to be included in such a course are covered in a manner 
which will be generally acceptable to most mathematical statisticians. Students who 
have had good previous mathematical training should find this book a valuable text 
if the material is presented by a competent instructor. 

References are collected together in the back of the book and ought to be more 
inclusive for a book of this kind. Particularly in an introductory text some mention 
should be made of a text on Design of Experiments and certainly of either or both of 
Fisher’s books, Statistical Methods and Design of Experiments. 

The book seems remarkably free of errors and appears to have been carefully 
written and edited. 


Modern Probability Theory and Its Applications. Emanuel Parzen. New York: John 
Wiley and Sons, Inc., 1960. Pp. xv, 464. $10.75. 


J. Laurie Snell, Dartmouth College 


Two decades ago, one wishing to teach a course in probability theory would have 
had a choice of one or two rather unsuitable text books. In the past decade at 
least ten books on probability theory have appeared, covering a wide range of topics 
and levels of difficulty. Before the appearance of Parzen’s book there remained two 
obvious gaps. One of these was noted in the preface of Doob’s book on Stochastic 
Processes when he complained that there was no standard reference on probability 
theory which would provide the prerequisite for reading his book. This gap remains. 
The other gap was a book which would provide the background of probability necessary 
for a student who wishes to study statistics or one of the many areas of application 
of probability. The book under consideration admirably fills this gap. 

The first three chapters of the book deal with sample space, the assignment of 
probabilities, conditional probability, independent experiments, and a brief discus- 
sion of Markov chains. These topics are treated for finite experiments, permitting a 
careful discussion with a minimum of mathematics. Chapters 4 through 8 discuss the 
theory associated with random variables and distribution functions. The normal, 
Poisson, and other related distributions are discussed and many applications given. 
For these chapters the reader needs a knowledge of calculus, including calculus of 
more than one variable. The author suggests that one year is sufficient; the reviewer 
would suggest two. The first eight chapters then constitute a text for a one-term 
junior or senior level undergraduate course in probability theory. Chapters 9 and 10 
are of a more advanced nature and treat characteristic functions and their applica- 
tion in proving limit theorems for sums of independent random variables. The ma- 
terial here is not easily available elsewhere, and these chapters are a valuable addition 
to the book. 

The book is written with considerable care, with the exception of a rather mysteri- 
ous definition of random phenomena and random events given in Chapter 1. This 
seems to be one last attempt to define randomness using the frequency notion. As 
might be expected, this definition is never really used in the book and the author fol- 
lows the now generally accepted mathematical treatment of sample space and meas- 
ure of events. An unnecessary attempt at being careful has led the author to use the 
notion of Borel set when clearly no real understanding of such matters is intended. 

There are a large number of illustrative examples which present a remarkable 
variety of applications of probability. The book also has interesting exercises, both 
theoretical and applied, at the end of each section. 

The reviewer would have wished that the author had done more to give the flavor 
of stochastic processes. This would be more in the spirit of modern probability theory. 

Parzen’s book is apt to be compared with Feller’s Introduction to Probability Theory 
and its Applications, and hence some remarks on this comparison might be made. 
Both books strive for completeness but in quite a different way. Parzen attempts to 
give as much as possible of the foundations of probability theory without using 
measure theory. He treats, therefore, both the discrete and continuous experiments. 
He keeps the level of difficulty fairly uniform. Feller restricts himself to discrete 
experiments and gives a very complete discussion, so complete that his book has served 
both as a text and a standard reference for this part of probability theory. The 
completeness causes the level of difficulty to be quite uneven. Feller’s book has been 
responsible for much of the rapid growth of probability theory in the past decade. It 
has itself, perhaps, created the need for a somewhat more comprehensive treatment 
of the foundations of probability theory at the undergraduate level. Parzen’s book 
provides this treatment. 


Analysis of Straight-Line Data. Forman S. Acton. New York: John Wiley and Sons, Inc., 
1959. Pp. vii, 267. $9.00. 


R. L. Anderson, North Carolina State College 


This book was written to explore the analysis of experimental data that can be 
described in terms of linear relationships. In Chapter 1 (Choice of a Model), the 
author discusses the appropriate models for seven experimental situations; however, 
there is too little connection between this introduction and subsequent chapters. For 
example, various methods are presented of estimating the parameters in a linear 
model when both x and y are in error, without making it clear that this is a serious 
problem only when the functional relationship between the expected values of y and 
x is desired. The author fails to mention an excellent discussion of this problem by 
the late C. P. Winsor in an article, “Which Regression?” in the 1946 Biometrics 
Bulletin. 

Apparently the author feels that the experimenter’s primary interest is in func- 
tional relationships and not prediction; if so, he should have made this clear in the 
introduction. This is in contrast to the procedure followed by E. J. Williams in his 
recent book, Regression Analysis (also a Wiley publication in 1959). Williams makes 
it clear that he is primarily interested in prediction; hence, he devotes only 14 pages 
(out of 208) to the topic of functional relationships. 

Analysis of variance techniques are used throughout the book, often without suffi- 
cient theoretical background. Many terms are inadequately explained. A useful addi- 
tion to the book would be an appendix of definitions of terms and symbols. The pro- 
cedures would have been clarified by the use of more examples, and by separating ex- 
amples from text material, as is done so well by Williams. Another improvement 
would be to cite more references, especially on the theoretical background. 

Chapter 2 on the Classical Model (x non-random) is devoted to the usual least 
squares estimates; for some unknown reason the minimum variance property for 
unbiased linear estimators is not mentioned. An undue amount of emphasis through- 
out the book is placed on the reduction in computing time and errors by the removal 
of an approximate line to reduce the size of the y variable. My experiences with errors 
made in decoding indicate that such a reduction may be illusory. Readers may be 
confused by the subsequent use of these assumed parameters in setting confidence 
intervals. This reviewer approves the use of the term “confidence limits” for limits 
on the prediction of an additional $y_0$ for an additional x; many statisticians (Williams 
included) call these tolerance limits. Acton also discusses tolerance limits, but in the 
sense of the proportion of the population contained within certain limits. One sec- 
tion is devoted to Modified Models for One Dependent Variable, including the Nair- 
Shrivastava method of dividing the data into three groups and a number of non- 
parametric procedures. A final section considers the problem of deciding whether 
two sets of y’s (for the same or different x’s) have the same prediction equation. 
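
The limits on the prediction of an additional observation discussed above can be written out as a sketch. It assumes the simple model $y = \alpha + \beta x + e$ with independent normal errors, least-squares estimates $a$ and $b$ from $n$ points, and a residual standard deviation $s$ on $n - 2$ degrees of freedom; the notation is chosen here for illustration and is not necessarily Acton's.

```latex
\hat{y}_0 \pm t_{n-2,\,1-\gamma/2}\; s
\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}},
\qquad \hat{y}_0 = a + b x_0 .
```

Dropping the leading 1 under the root gives the narrower limits for the mean of y at $x_0$; limits meant to contain a stated proportion of the population, the other sense of "tolerance limits" mentioned above, require a different multiplier.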

Variance component analysis is introduced in Chapter 3 (Regression with Several 
Values of y for Each Known x). Since many readers of this book may not be too 
familiar with variance component analysis, introducing the topic with an example 
based on unequal numbers per class probably is not the best procedure. A concluding 
section on the theory of variance components for finite populations seems somewhat 
out of place in a methodology book. In the model 


$$y_{ij} = \alpha + \beta x_i + \delta_i + e_{ij} \qquad (i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, k),$$ 


it is assumed that the nk e’s come from a population of size N instead of the usual 
assumption that the k e’s for each i come from a population of size K. The expected 
values for the two situations are not the same. 

In Chapter 5 (Regression with Both x and y in Error), emphasis is placed on maximum-likelihood 
estimation when two of the three required parametric relations are 
assumed known. Five methods of using available data to estimate the relationship 
are presented. This presentation could have been improved by including a general 
summary of the methods, with a combined table to compare the various confidence 
regions. The reader will note a serious computing error on page 154, which would 
have been detected if the various sets of confidence limits had been compared. The 
author fails to note that the Nair-Shrivastava procedure is often used when the errors 
are not correlated, as are the similar Wald-Bartlett and Gibson-Jowett procedures. 

In Chapter 6 (Several Lines; the Analysis of Variance), one wonders why the an- 
alysis of variance is restricted to this title when it is used throughout the book. The 
author manages to insert some material on multiple comparisons in this chapter; 
contrary to the views of another reviewer of this book, who says this material is too 
sketchy, I feel this topic has occupied too much time of too many people already. 
The use of multiple comparison techniques is becoming a substitute for thinking by 
the experimenter as to which comparisons he really should make. In this reviewer’s 
opinion the only serious problem of this nature is the one in genetics, for example, of 
selecting the best a% from a group of strains. 

Chapter 7 (The Exposure of Curvature: Orthogonal Polynomials), makes use of a 
computing method by Crout (Williams uses this same method). This is not much dif- 
ferent from the widely used Abbreviated Doolittle or Square Root Methods. 

Other chapters include Samples from Bivariate Normal Populations, The Use of 
Transformations, The Rejection of Unwanted Data and Cumulative Data: the Fad- 
ing Line. 

Although the reader will find it hard to follow many of the procedures presented 
in this book, he will be adequately rewarded if he takes to heart some of Dr. Acton’s 
keen observations on the analysis of data. It is a pity remarks such as these were not 
underscored: 


p. viii: Mathematical rigor is desirable, but its absence is not a just basis for rejec- 
tion. 

p. 28: Data are expensive, and mistakes are expensive; the decision is thus one of 
balancing the cost of gathering more information against the cost of making 
a wrong move because of insufficient information. 

p. 178: The analyst can more profitably treat sets of data which are balanced, and 
all experimental men may take this as a strong plea for such balanced sets of 
data when they design their experiments. 

p. 192: The agreement between duplicates is an overly optimistic estimate of repro- 
ducibility. 

p. 193: When faced with a mutually exclusive choice between a transformation for 
uniform variance and one for a linear relationship, this writer must almost 
always decide in favor of the former. 

p. 224: Physical scientists and engineers need not be encouraged to ignore an ob- 
stinate outlying datum—rather they need to be held in check. 


A History of the International Statistical Institute, 1885-1960. J. W. Nixon. Geneva: The 
Hague, 1960. Pp. 188. No price listed. 


Gertrude M. Cox, Research Triangle Institute, North Carolina 


This book tells of the development and reorganization of the International Statistical 
Institute, while also discussing the Institute’s scientific and educational 
work in an interesting manner. 

The book notes that international co-operation of statisticians is one of the oldest 
forms of organized contact between scientific workers on an international basis. This 
contact began during the middle of the 19th century when a period of rapid indus- 
trial development existed in Europe, and public interest grew in questions relating to 
the conditions of the people of the world. In the years 1830-1850, statistical offices 
were set up in many countries and national statistical societies were founded. 

From the need for an exchange of knowledge and experience official statisticians 
held their first International Statistical Congress in Brussels in 1853. The Congress 
dealt primarily with administrative and official statistics and with the question of 
a statistical organization. The members proposed that a central statistical commis- 
sion be set up in each country; these commissions in turn were to be attached to an 
international congress charged with establishing comparability between statistics 
published in different countries. The Congress held a series of sessions indirectly 
leading to the founding of the International Statistical Institute of 156 members in 
1885. The Institute held its first session in Rome in 1887. Until 1913 it held a session 
every two years, the Permanent Office being established in The Hague in 1913. 

The war of 1914-1918 severely halted the Institute’s work. Following the war, 
several international statistical organizations were established. These organizations, 
such as the International Statistical Commission of 1920, forced the Institute to 
alter its aims and purposes. The Institute was proud of the prestige it had acquired 
both in governmental and non-governmental statistics. Also, by this time, most of 
the leading governmental and official statisticians had been elected members. The 
Institute tried to maintain its position both as an independent, autonomous body of 
impartial statisticians and as a semi-governmental institution which made recom- 
mendations to governments on the methods of compilation of their statistics in the 
interest of international comparability. 

The war of 1939-1945 emphasized the fact that the Institute must give up some of 
its most cherished functions and take up new ones. Thus, the Institute abandoned its 
aim of submitting recommendations on statistical matters to governments; it 
ceased to publish international statistical yearbooks covering national statistics; it 
widened its membership to include other than official statisticians; and it endeavored 
to maintain close collaboration with national and international agencies. 

The period of twelve years from 1947-1960 has been the most revolutionary in the 
whole of the Institute’s seventy-five years. In 1947, the Institute adopted a new 
constitution which made substantial changes in the objectives and organization of 
the Institute. It has become an Institute, truly Statistical, and more International. 

For the first 60 years, the Bureau administered the Institute, but recently members 
have taken more interest in the work and future of the Institute. Strength has come 
from required rotation among officers and from greater collaboration with affiliated 
organizations. The Institute is making contributions to the science of statistics, and 
its sessions now consider an increasing range of statistical areas. 

The early efforts of the Institute did lead to definite improvements in the interna- 
tional official statistics. The Institute should not neglect this aim as it gives more 
emphasis to such areas as sampling theory and practice, social and demographic 
statistics, probability relationships, experimental design, statistical inference, 
industrial applications and the uses of statistics as an aid to physical science research. 

Some committees have accented the progress with their programs. The Committee 
on Statistical Education (1950) has promoted short seminars and has helped estab- 
lish the Calcutta and Beirut Training Centers. The Committee on Statistics in 
Industry and Technology (1951) has had a series of meetings to promote its interests 
and has established the International Journal of Abstracts on Statistical Methods in 
Industry (1954). The Municipal Statistical Section started as a Committee (1953), 
and now publishes the only official statistics still being compiled by the Institute. Re- 
cently, a Committee on Statistics for Physical Sciences was established which has 
been active in planning the programs and special seminars held by the Institute. 

Besides the work of committees, the Institute has helped substantially in the diffu- 
sion of statistical information with a Dictionary of Statistical Terms published in 
1957 and an International Journal of Abstracts; Statistical Theory and Methods 
started in 1959. The Review Journal is being rejuvenated to meet the present challeng- 
ing program objectives of the Institute. 

Although the Institute has expanded and has changed the character of its member- 
ship, it still retains a small membership (presently totaling only 337 members) as 
well as its main objective, “The development and improvement of statistical methods 
and their application throughout the world.” Hopes for the next ten years are that 
the Institute will continue to contribute not only to substantial advancement in 
statistical methodology and theory but also to the breadth and the stature of its 
own total program. 


Statistical Theory of Communication. Y. W. Lee. New York: John Wiley and Sons, Inc., 
1960. Pp. xvii, 509. $16.75. College edition, $14.00. 


AMIEL FEINSTEIN, University of Illinois 


This book is based upon a course which the author has offered for a number of 
years at the Massachusetts Institute of Technology. As such it exhibits the ad- 
vantages of expanding material from a well developed course into a book. The result 
is a smooth and connected presentation of the material covered. 

The author’s intention is to present, within the limits implicit in a one semester 
course, a careful and self-contained account of Wiener’s theory of generalized har- 
monic analysis and linear prediction and filtering. Chapter I, a brief introduction, is 
followed by some one hundred pages devoted to generalized harmonic analysis. A 
familiarity with Fourier series is presumed; thereafter the treatment is methodical, 
with examples all along the way against which the reader may verify his comprehen- 
sion of the material. The next one hundred and forty pages are devoted to a discus- 
sion of random variables, leading up to stationary random processes. The results ob- 
tained up to this point are then applied briefly to the detection of periodic signals by 
correlation techniques. The next objective is the study of linear filtering and predic- 
tion. The treatment again is very complete; every step of Wiener’s classical solution 
is carefully set out and explained. There follows a brief chapter containing several 
applications and extensions of the basic techniques developed. Finally, in order to 
introduce the reader to the methods which Wiener has recently developed for non- 
linear systems, the author concludes with two chapters devoted to the use of ortho- 
normal expansions in the synthesis of linear systems. 

It might appear that the use of five hundred and one pages to cover this material is 
somewhat extravagant. However, the author’s aim was not to write a monograph, but 
rather a textbook for first-year graduate students in electrical engineering. Thus, for 
example, the discussion of random variables and random processes is carried out without
the introduction of an underlying probability space. It is to the author’s credit 
that the treatment, say, of ergodicity, formal and restricted as it must be, is none the 
less quite cogent. The price paid is the length of exposition already referred to. While 
length in itself is no drawback, the reviewer regrets that the author did not choose to 
make some mention of the measure theoretic concepts to which all rigorous treat- 
ments of random processes nowadays refer. Undoubtedly it was felt that such men- 
tion would not contribute to the basic purpose of the book. Still, the graduate 
student in communication theory today finds it increasingly necessary to consult 
mathematical papers on random processes. The present volume, with little or no in- 
crease in length or sacrifice of clarity, could have included a thorough introduction 
to the viewpoint of the rigorous mathematical treatments. Shorter works have done 
so. But it is the author’s prerogative to select and treat his material as he will. Hav- 
ing done so, he has succeeded very well in producing a most readable account of the 
basic elements of statistical communication theory, which students of the subject are 
likely to find valuable for some years to come. 


Principles of Regression Analysis. R. L. Plackett. New York: Oxford University Press, 
1960. Pp. ix, 173. $5.60. 


WILLIAM G. COCHRAN, Harvard University 


A few quotations from the preface will indicate the scope, level and point of view 
of this book. 


“The field of regression analysis is here supposed to consist of the algebraic theory 
and numerical methods associated with the principle of least squares, its applications 
in the analysis of experimental data, and the construction of experimental designs. . . 

... The emphasis throughout is on the assumptions which are made and on the 
properties of relevant estimates or test criteria, whereas the questions which arise 
when planning an inquiry or interpreting the results of statistical analyses are omitted. 

“A prerequisite for reading this book is some familiarity with the basic elements 
in the theories of statistics, matrices, complex variables, and groups. The class of 
readers envisaged thus consists primarily of mathematicians, either graduates or in 
the final stages of their undergraduate career.” 


The first two chapters contain background material. Chapter 1 deals with the 
solution of linear equations, including the square-root method of matrix inversion, 
Fox’s relaxation method of approximate solution, and a discussion of loss of decimal 
accuracy in matrix computations. Chapter 2 presents the principal results on the 
distribution of quadratic forms and ratios of two quadratic forms in a multivariate 
normal system, relying mainly on the method of characteristic functions for proofs. 

The next two chapters constitute the heart of the book. The normal equations, the 
properties of the least squares estimates, the situation in which the matrix of the 
normal equations is singular and that in which the residuals are correlated (with a 
known covariance matrix) are covered in Chapter 3. The standard tests of linear 
hypotheses and some optimality properties of the F-test occupy Chapter 4. 

The effects of failures in the basic assumptions are next considered (Chapter 5). 
Three types are investigated: (i) non-normality in the residuals, (ii) inequality of 
residual variances, (iii) unsuspected correlation among the residuals. In (iii) only 
a precis of some results by Watson is given, but Chapter 7 is devoted to an interesting
account of the status of regression theory when the residuals form a stationary 
autoregressive system. 

The remaining major topics are polynomial regression (Chapter 6), orthogonal arrays
and their application to factorial experiments (Chapter 8), and permutation 
distributions over a randomization set (Chapter 9). 

Rigorous proofs are given of all the basic results, with sketches of the proof for 
some of the more recent and complex developments. The level of exposition is high, 
though the reader needs a firm grasp of matrix and determinantal theory and (to a 
lesser extent) of contour integration, since results from these areas are used freely. 
At the end of each chapter there is a good selection of mathematical exercises, many 
of them embodying results in the research literature. 

In view of his objectives as stated in the preface, Mr. Plackett has written a suc- 
cessful book. It is a book about mathematics for young mathematicians, rather than 
about statistics for young statisticians. As a possible text for a course on regression 
leading to a Ph.D. in Statistics the book would require much supplementation to ex- 
plain why the topics and results are important to statisticians, to give the students 
training in the difficult business of applying the results to statistical problems and to 
introduce aspects of regression theory omitted by the author. As a reference for extra 
reading during such a course the book is highly welcome, since it brings together in 
a lucid and compact form much of the recent theoretical work on regression. 


The Analysis of Multiple Time-Series. M. H. Quenouille. New York: Hafner Publishing 
Company, 1957. Pp. 105. $4.75. 


T. W. ANDERSON, Columbia University 


This book reports some research and ideas of the author concerning multivariate 
time series; that is, several series of observations repeated over time. In most of the 
study the statistical model assumed is a stochastic difference equation (also called 
autoregressive scheme); some attention is paid to a model of a finite moving average 
(of independent terms); and occasionally reference is made to a combination of these 
models in which the disturbance of the stochastic difference equation is a moving 
average. These are all special cases of a vector stationary stochastic process. 

The Introduction (Chapter 1) is followed by a chapter on Specification which deals 
with the second-order moments of the process (that is, the variances and covariances 
in the population including lagged covariances). Use of generating functions gives 
relations between the moments and the coefficients of the difference equation and/or 
the moving average. Chapter 3 includes some remarks on Identification, that is, to 
what extent the second-order moments determine the coefficients of the model. 

Chapter 4, Preliminary Investigation, presents five artificial multiple series gen- 
erated by random numbers with assumed stochastic difference equations. Second- 
order population and sample moments up to lags of 5 are computed and discussed. 
The sample moments of each multiple series are used to test the null hypothesis that 
the corresponding difference equation is first-order (that is, Markov). Chapter 5, 
Practical Complications, discusses the effect of assuming one model when another is 
true, such as assuming stationarity when the process is not stationary, and some 
other questions involving modifications of the model. Chapter 6, Estimation, discusses
estimating the parameters of a stochastic difference equation and also prediction
in several models when the parameters are known. The title of Chapter 7 is Significance
and Goodness-of-fit Tests. Chapter 8, Practical Examples: U. S. Hog 
Series, is an application of some of the proposed methods to a series of 82 observations
on 5 variables. 

Unfortunately, this book has serious drawbacks that limit its value; it is hard to 
understand, inaccurate, and sometimes in outright error. Perhaps the topic in this 
study of most interest to statisticians is estimation of the coefficients, but this 
topic also involves the greatest inadequacies of the monograph. I review in some de- 
tail the first five pages of Chapter 6 on Estimation (pages 65-9) in order to correct 
some of the most important errors and to indicate more concretely the kinds of 
models studied. 

The simplest model is the univariate first-order (Markov) stochastic difference 
equation 


(1)    $x_t = \rho x_{t-1} + \epsilon_t, \qquad t = 2, 3, \ldots, N,$


where $-1 < \rho < 1$ and the (unobservable) $\epsilon_t$ are independently and identically distributed
with means 0 and variances $\sigma^2$. When the distribution of $x_1$ is given, the 
joint distribution of the sequence is obtained by applying (1) successively. Inference 
in any of these stationary (or near stationary) processes when the mean is assumed 0 
and the distributions are assumed normal is based on the sample serial or lag covari- 
ances 


(2)    $C_j = \frac{1}{N-j} \sum_{t=j+1}^{N} x_t x_{t-j}, \qquad j = 0, 1, \ldots, N-1,$

and simple modifications of them. At the beginning of this chapter it is indicated 
that $C_1$ and two specified modifications of $C_0$ are a set of jointly sufficient statistics. 
This statement is correct if $x_1$ is assumed a given constant, but the author does not 
mention this condition. This instance is typical of the looseness with which the book 
is written; the author is not careful to indicate the conditions under which a given 
statement is true. The assertion we are concerned with here, however, is incorrect if 
$x_1$ is given a distribution so the sequence is stationary, for then the set of sufficient 
statistics is $C_1$ and two other modifications of $C_0$. Throughout most of the book the 
model seems to correspond to $x_1$ distributed so the sequence is stationary, but possibly
the author has changed his model here without informing the reader. When the 
author comes to the multivariate case, he makes the reverse switch: he indicates he 
is dealing with the stationary case, but writes down the likelihood function under the 
assumption $x_1$ is fixed. This shifting of conditions without notice to the reader is, of 
course, disconcerting. 

The modifications of $C_0$ are effected by deletions of $x_1^2$, $x_N^2$ and both, and in large 
samples they are equivalent to $C_0$. Then $r_1 = C_1/C_0$ is equivalent to any of the usual 
definitions of serial correlation (when the mean is known to be 0) and is the estimate 
of $\rho$. The question of whether some other function of the observations is preferable 
for given small sample sizes is apparently studied by the author by using the large 
sample theory of several serial correlations $r_j = C_j/C_0$. He arrives at an estimate that 
is a complicated function of several $r_j$. This is a peculiar method leading to a peculiar 
result since the best estimate need depend only on the sufficient statistics which do 
not include the serial covariances with lag greater than 1. 
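
As a hedged illustration of the quantities discussed above (it is not taken from the book under review), the following Python sketch simulates the first-order scheme (1) with normal disturbances and a stationary start, computes the lag covariances of (2) with the mean assumed 0, and forms the first serial correlation as the estimate of the autoregressive coefficient. The function names, the coefficient 0.6, the series length, and the seed are assumptions made for this example only.

```python
import random


def simulate_ar1(rho, n, sigma=1.0, seed=0):
    """Simulate x_t = rho * x_{t-1} + eps_t, t = 2..n, with a stationary start."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    rng = random.Random(seed)
    # Stationary start: Var(x_1) = sigma^2 / (1 - rho^2).
    x = [rng.gauss(0.0, sigma / (1.0 - rho * rho) ** 0.5)]
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, sigma))
    return x


def lag_covariance(x, j):
    """C_j = (1 / (N - j)) * sum over t = j+1..N of x_t * x_{t-j} (mean taken as 0)."""
    n = len(x)
    return sum(x[t] * x[t - j] for t in range(j, n)) / (n - j)


def serial_correlation(x, j=1):
    """r_j = C_j / C_0."""
    return lag_covariance(x, j) / lag_covariance(x, 0)


# Illustrative values only: a long series so that r_1 settles near rho.
x = simulate_ar1(rho=0.6, n=5000, seed=1)
r1 = serial_correlation(x, 1)
print("r_1 =", round(r1, 3))
```

With a long series the printed value should lie close to the coefficient used in the simulation; for short series the choice among the modifications of the lag-0 covariance begins to matter, which is the small-sample question raised in the review.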

In the case of “superposed error” one observes $y_t = x_t + v_t$, where $v_t$ is random error 
(independently and identically distributed with means 0 and constant variances), 
and defines the statistics in terms of $y_t$. The author asserts that the estimate of $\rho$ 
for large samples is $r_2/r_1$ ($= C_2/C_1$), which he says “is 100 per cent efficient.” This 
assertion is incorrect. The author makes such erroneous statements, I believe, be- 
cause he does not feel obligated to present evidence for his assertions. What is more 
disturbing to the reader is that often he does not even indicate what type of evidence 
leads him to the assertion—whether a mathematical proof, his experience in statisti- 
cal analysis, or a hunch. 
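
The ratio of the second to the first serial correlation can be motivated by a short calculation, sketched here under the added assumptions that the underlying first-order process is stationary with coefficient $\rho \neq 0$ and that the superposed error is uncorrelated with it. The symbols $\gamma_y(j)$, $\sigma_x^2$ and $\sigma_v^2$ are notation introduced for this sketch. It shows only what the ratio estimates in large samples; it says nothing about efficiency.

```latex
\begin{aligned}
\gamma_y(0) &= \sigma_x^2 + \sigma_v^2, \qquad
\gamma_y(j) = \rho^{j}\,\sigma_x^2 \quad (j \ge 1), \qquad
\sigma_x^2 = \frac{\sigma^2}{1-\rho^2}, \\
\frac{\gamma_y(2)}{\gamma_y(1)} &= \rho, \qquad
\frac{\gamma_y(1)}{\gamma_y(0)} = \rho\,\frac{\sigma_x^2}{\sigma_x^2+\sigma_v^2}.
\end{aligned}
```

Under these assumptions the first serial correlation of the observed series understates $\rho$ in absolute value whenever $\sigma_v^2 > 0$, while the ratio of the lag-2 to the lag-1 covariance does not.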

A simple example of “correlated errors” occurs when we replace $\epsilon_t$ in (1) by a moving
average $u_t + \alpha u_{t-1}$, where the $u$’s are independent. The author states that in this 
case $r_2/r_1$ provides a large-sample estimate of $\rho$ of highest efficiency. This statement 
also is incorrect. Here the author suggests some evidence, namely, the asymptotic 
theory when the $\epsilon_t$ are independent. In my opinion he is led to making wrong statements
based on faulty evidence in this and in other instances because he is not convinced
that rigorous and complete proofs are necessary. One other kind of model is 
mentioned later in this chapter in which $x_t$ is a moving average of a finite number of 
independent random variables; the author incorrectly states that the $C$’s of lowest 
order provide a set of sufficient statistics. 

This detailed discussion of five pages should serve as a warning to the prospective 
reader. Although the frequency of errors and confusion is substantially less in the 
remainder of the book than here, it is enough so that it is impossible to correct and 
clarify them in this review. (I found 73 typographical errors in these 105 pages.) 

In spite of all its shortcomings, this book will be of interest to mathematical statis- 
ticians doing research on these finite-parameter models for stationary vector proc- 
esses, for it contains a number of interesting ideas. Perhaps the most interesting new 
material deals with “canonical variables.” The first-order multivariate stochastic difference
equation is given by (1) where $x_t$ and $\epsilon_t$ are vectors and $\rho$ is a matrix. The 
canonical variables are linear combinations of the components of $x_t$ defined by the 
characteristic vectors of the matrix $\rho$, and the individual behavior of each canonical 
variable is a function of the corresponding characteristic root. Higher-order difference
equations also have associated canonical variables, but they are not so easily 
interpreted. The author uses these ideas in the statistical analysis of the artificial 
series and the practical example. 
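
To make the role of the characteristic roots concrete, here is a short sketch of the standard argument. It assumes, purely for illustration, that the canonical variables are formed from left characteristic vectors $\ell_i$ of the coefficient matrix $\rho$ (the review does not say which convention the author uses); the symbols $\ell_i$, $\lambda_i$, $z_{it}$ and $\eta_{it}$ are introduced here.

```latex
\ell_i' \rho = \lambda_i \ell_i', \qquad z_{it} = \ell_i' x_t
\quad\Longrightarrow\quad
z_{it} = \ell_i' \left( \rho x_{t-1} + \epsilon_t \right)
       = \lambda_i z_{i,t-1} + \eta_{it},
\qquad \eta_{it} = \ell_i' \epsilon_t .
```

Under this assumption each canonical variable follows a univariate first-order scheme whose coefficient is the corresponding characteristic root, although the disturbances $\eta_{it}$ belonging to different canonical variables are in general correlated with one another.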

The analysis of time series studied here is in terms of relations between variables 
rather than in terms of frequencies leading to spectral analysis. This approach, par- 
ticularly of “structural relations,” has been of interest to econometricians. However, 
there is only a brief reference to the studies of econometricians, namely to “identifica- 
tion conditions” in Cowles Commission Monograph No. 10 (and this reference is mis- 
leading and confused). The author’s point of view differs from that of many econo- 
metricians in that his is more exploratory, while the econometrician is more apt to 
use some economic theory to restrict the models. It can be expected that the pro- 
cedures proposed by the author will be useful in various fields, and it is to be hoped 
the ideas presented here will be developed further. 


Statistical Design. W. J. Youden. Washington, D. C.: American Chemical Society Applied 
Publications, 1960. Pp. 68. $2.00. Paper. 


K. A. BROWNLEE, University of Chicago 


In 1954 W. J. Youden was asked by the Editor of Industrial and Engineering Chemistry
to contribute bimonthly a short article on “Statistical Design.” This volume 
contains reprints of 34 such articles from the years 1956-1959 (W. S. Connor has con- 
tinued the series). The space available per article appears to have been about 1800 
words, and it is a severe challenge to any statistical writer to turn out every other 
month an article of this length that would be (a) interesting enough so that the chem- 
ists and chemical engineers for whom it was intended would read it, (b) approximately 
within their comprehension, no statistical prerequisites being implied, and (c) not 
statistically vacuous. 

A chemical engineer is, of course, involved in a far wider range of activities than 
the layman’s idea of a chemist. Such topics as the abrasive wear or tensile strength 
of rubber, radioactive counting, fouling of spark plugs, the baking of cakes, or the 
homogeneity of alloys are part of the chemical engineer’s domain, as well as the 
more traditional areas of analytical chemistry. 

The kind of techniques that Youden finds appropriate to bring to the chemical 
engineers’ attention include balanced incomplete blocks, analysis of variance, randomization,
linear regression (with an ingenious geometric interpretation), complex 
sampling, weighing designs, fractional replication, Bechhofer’s work on finding the 
best treatment in a group, and so on. 

No one article has the space to go deeply into any topic, and this collection makes 
no pretense at being a textbook, or even a handbook. The purpose was to show the 
chemical engineer that statistical design can be of use to him; for the chemical en- 
gineer to make any actual application in his own work, it would be necessary for him 
to learn a great deal more or to consult a statistician expert in the field of design. 

Youden succeeded in his assignment brilliantly, and anyone so unfortunate as to 
become committed to such a task should study (and plagiarize) his efforts with the 
greatest care. 


Theory of Value: An Axiomatic Analysis of Economic Equilibrium. Cowles Foundation 
Monograph 17. Gerard Debreu. New York: John Wiley & Sons, Inc., 1959. Pp. ix, 114. 
$4.50. 


ROBERT H. STROTZ, Northwestern University 


This modern presentation of the Theory of Value deserves the highest praise. The 
treatment consists in a rigorous, axiomatic, and formalistic analysis of the 
nature of producer behavior, of consumer behavior, of general equilibrium, and of an 
economic optimum. 

In the opening chapter Debreu presents, starting from scratch, “all the mathe- 
matical concepts and results” which he later uses. The chapter is concise and system- 
atic. It deals with sets, functions, correspondences, preorderings, the real number 
system, continuity, vector sums and products, and fixed points. As is evident from 
this list of topics, emphasis is placed upon the theory of sets and topology rather 
than upon the calculus, in line with the modern developments in economic theory. 
Those who would like to be exposed to the mathematical tools now coming into vogue 
in economics will find an excellent introduction in this chapter. 

In keeping with the formalist character of the book, the concepts of commodities, 
prices, producers, consumers, etc. are defined in an abstract manner and essentially 
in terms of their mathematical properties. There is good reason to do this for it pro- 
vides the analysis with great generality and both widens and delineates the possi- 
bilities for other special interpretations of these concepts. The final chapter on Un- 
certainty is, in fact, a case in point. Here the concept of a certain commodity is ex- 
tended to that of an uncertain commodity by making events, like locations or dates, 
serve also as identifying characteristics of commodities. Similarly for the associated 
concepts of prices, production sets, etc. Once this extended interpretation of these 
concepts is shown to satisfy the conditions of the original definitions, the earlier 
theorems can quickly be stated still to apply. 

The major chapters of the book deal with equilibria and optima. The chapter on 
Equilibrium is concerned with conditions for the existence of an equilibrium. The 
major limitations of economic significance are the exclusion from the analysis of 
problems of increasing returns in production, of externalities in production and con- 
sumption, and of monopoly elements. The analysis, moreover, does not incorporate 
money as a store of value, though it does proceed in terms of money prices. Questions 
of the uniqueness of equilibrium and of its stability are also omitted. All these limita- 
tions are explicitly recognized by the author. 

The chapter on an economic optimum sets forth conditions under which, relative 
to a price system, an equilibrium is a (Pareto) optimum and an optimum is an equilibrium.

There is really nothing in this book concerned with relating the formal theory to 
reality or with appraising the usefulness of the analysis for positive economics in the 
sense of prediction; the author’s concern is exclusively with the formal logic of the 
analysis. This is not a criticism. It is simply to note what the book is and is not about. 
The problems raised, though formal in character, are natural problems. Once they 
are rigorously stated, economists cannot help but want them to be rigorously 
answered. Once we become aware that counting equations and unknowns cannot 
settle the existence question, we must welcome a contribution that does help to 
settle it. 

It is not evident to this reviewer that the present book would be of any immediate 
interest to the statistician per se. But any statistician having a side interest in the 
economist’s study of the logic of preference structures, decentralized decision mak- 
ing, and social optimality will find that this book gives an excellent account of much 
of the subject. 


The Quality and Economic Significance of Anticipations Data. National Bureau of Eco- 
nomic Research. Princeton, New Jersey: Princeton University Press, 1960. Pp. ix, 466. 
$9.00. 


RICHARD R. NELSON, Carnegie Institute of Technology 


This is an interesting and, on the whole, a quite useful volume. The essays deal 
with several facets of “anticipation data”; data collected by survey methods on 
the plans and expectations of various sorts of economic units. The essays discuss the 
various series that are being collected, introduce some new ones, attempt to spell out 
the developing theoretical framework within which these data can play a role, and 
attempt to evaluate the usefulness of these series to economic analysis and forecast- 
ing. 

Three such surveys are, by now, old friends to many economists and statisticians: 
the Michigan Survey Research Center survey on consumer purchase plans, anticipa- 
tions, and attitudes; the Securities and Exchange Commission-Department of Com- 
merce survey of business investment plans; and the McGraw-Hill survey on the 
same topic. An excellent article by Arthur Okun examines their usefulness in fore- 
casting the components of GNP. Okun finds that, particularly when the survey data 
are artfully combined with other data, they are quite useful inputs to forecasting 
models. It seems clear that these anticipation series will play an increasingly im- 
portant role in forecasting and analysis of our economic system. 

Three quite interesting new (or previously untapped) series are introduced in the 
volume. Thomas Juster examines the Consumers Union Spendings-Intentions sur- 
vey, and finds that in several ways it may have advantages over the Survey Re- 
search Center data. Morris Cohen introduces a promising new survey on business 
capital appropriations being collected by the National Industrial Conference Board. 
This series may provide useful information on longer run investment plans than are 
covered in the Commerce-SEC, and McGraw-Hill surveys. James O’Leary discusses 
a series on the forward commitments of Life Insurance Companies. 

The volume also contains well done background essays by Charles Holt and Henri 
Theil on the use of forecasts in decision making. (Theil’s essay contains a neat, con- 
cise statement of his recent work on the value of forecasting.) There are several essays 
on why expectations are what they are, a fascinating and important topic, but these 
essays did not do much to enlighten this reviewer. However, Millard Hastay’s essay 
contains some quite interesting statistical work. There are essays on the value of an 
index of consumer attitudes, and some strong, if well mannered, disagreements on 
the matter. The reviewer found the Robert Levine, Dexter Keezer et al., and Murray 
Foss-Vito Natrella essays which describe and attempt to explain the extent of fulfill- 
ment of capital expenditure plans, all quite interesting. 

One of the more striking aspects of the book is the relatively high degree of con- 
cord among a large percentage of the authors about how anticipations data might 
best fit into predictive models of behavior. There is, in general, a clean line drawn 
between the following two types of anticipations: (1) plans for action, (2) expectations
with respect to variables outside the decision maker’s control. And explicitly, 
or implicitly, the following model, drawing heavily from the earlier work of Hart, 
Modigliani, Bellman, and others, seems to stand out. 

The context is that of a multi-stage decision problem. The individual, or organiza- 
tion, is viewed as attempting to maximize (or at least do as well as he can) over a 
many period horizon. At any given time, t, the individual’s position is in part a func- 
tion of his past actions, in part a function of external factors. Given his position, and 
given his objectives, he must make a decision about what to do at time t. This deci- 
sion must be made on the basis of the data the decision maker has available about 
what the state of the world is, and what it is likely to be in the future. The decision 
maker must consider what the state of the world is likely to be in the future before 
he makes his decision today, because his decision today will affect his position tomor- 
row. And because of this, a decision today is made in the context of a tentative plan 
with respect to future decisions. 

In general, it may be expected that if the decision maker’s anticipations about 
what the world will be tomorrow turn out to be correct, his planned decision for to- 
morrow will be the action he actually takes. But if his anticipations turn out to be 
incorrect and his actual position differs from his anticipated position, he is likely to 
find it advantageous to diverge from his planned action. Thus the decision maker’s 
action at any time, t, is a function of (1) his previously planned action, (2) his antici- 
pated position, and (3) his actual position. More explicitly, actual purchases of dur- 
able consumer goods might be a function of planned purchases, expected income, 
and actual income. Actual capital expenditure might be a function of planned ex- 
penditure, expected capital goods prices, and actual capital goods prices. 

Notice that in the above analysis anticipations data (plans and expectations) are 
combined with other data which describe the actual position of the decision making 
unit, in the predicting equation. The work of Juster, Levine, Okun, and Holt clearly 
reflects this position. It is the reviewer’s belief that this framework is a promising 
one. It provides a structure for data gathering and data use which tends to reduce 
the danger of measurement without theory; an ailment which has rendered a good 
share of the economic data we collect, quite worthless. The reader of this volume is 
likely to come away bored by some parts of it, irritated by other parts of it, but gen- 
erally impressed with the quantity and quality of the work which is going on in the 
anticipations data field. 


Interindustry Economics. Hollis B. Chenery and Paul G. Clark. New York: John Wiley and 
Sons, Inc., 1959. Pp. xv, 345. $7.95. 


O. H. BROWNLEE, University of Minnesota 


By interindustry economics the authors mean analysis of the relationships 
among industries as consumers of each other’s outputs and of basic resources and 
as suppliers to consumers. The analytical model most frequently employed in this 
book is input-output analysis, although the more general activity analysis model 
also is described and employed in one interesting example. 

In input-output analysis an industry is assumed to employ in fixed proportions 
certain outputs of other industries, (perhaps) some of its own output and certain 
basic resources (such as labor). Demands of product for final consumption usually are 
considered to be exogenous or to contain exogenous components. Thus, given a 
matrix of input-output coefficients (determined by technological considerations) one 
can determine (1) the various final bills of goods that could be produced from any 
bundle of resources and (2) the amounts of basic resources and the levels of opera- 
tions of each industry required to produce any final bill of goods. 
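
The second calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a procedure taken from the book: the two-industry coefficient matrix, the labor coefficients, and the final bill of goods are invented numbers, and the helper names are hypothetical. Here `a[i][j]` is taken to be the amount of good i used per unit of output of industry j, so gross outputs x satisfy x = A x + d, that is, (I - A) x = d.

```python
def solve(m, b):
    """Solve the linear system m x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(m, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def gross_outputs(a, final_demand):
    """Industry output levels x solving (I - A) x = d for a final bill of goods d."""
    n = len(a)
    i_minus_a = [[(1.0 if i == j else 0.0) - a[i][j] for j in range(n)]
                 for i in range(n)]
    return solve(i_minus_a, final_demand)


# Hypothetical two-industry economy (illustrative numbers only).
a = [[0.2, 0.3],
     [0.4, 0.1]]          # input-output coefficients
labor = [0.5, 0.25]       # labor required per unit of each industry's output
d = [60.0, 60.0]          # final bill of goods

x = gross_outputs(a, d)
labor_needed = sum(l * xi for l, xi in zip(labor, x))
print("gross outputs:", [round(v, 6) for v in x])
print("labor required:", round(labor_needed, 6))
```

For these invented numbers each industry must produce 120 units to deliver 60 units of each good to final demand, and 90 units of labor are required. Because the coefficients are fixed, doubling the final bill of goods doubles every output level and the labor requirement.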

Input-output analysis assumes that for a given technology each industry has only 
one way to produce. If there is more than one production method available to an 
industry, and there is fixed proportionality among inputs and outputs for any 
method, an activity analysis model can be employed. 

This book describes the basic analytical techniques and how the required data for 
estimation usually are obtained, discusses the basic assumptions of input-output 
analysis—particularly the assumption of fixed proportionality and the consequent 
invariance of production to factor price changes—and describes some applications. 

More lucid descriptions of the models than are provided in this book are available. 
Also the limitations of input-output analysis have been pretty thoroughly discussed 
elsewhere. However, a person marooned on a desert isle with only this book could 
gain an understanding of input-output analysis and some computational procedures. 
His understanding of activity analysis would be less complete. 

Competitors, for forecasting purposes, with the general equilibrium type models 
described in this book are so-called “naive models” (such as assuming that the out- 
put of an industry is a fixed proportion of the final demand for the products of that 
industry in some “base year”) and regressions of industry outputs on other economic 
variables, using time series data as sample values for the estimates of regression 
coefficients. Both of the competitors are less expensive than input-output models. 
Because of changes in the input-output coefficients over time, it is not at all clear 
that strict mechanical application of the input-output technique is superior to the 
multiple regression technique for certain forecasting purposes. However, knowledge 
of changes in the input-output coefficients could be incorporated into the projec- 
tions. 

Several examples of input-output projections—some from studies with which the 
authors were associated—are described. In addition a linear programming formula- 
tion of a model for an economic development program is presented. In the expository 
model, the activities consist of importing, producing for domestic consumption and 
producing for export each of three classes of commodities (finished goods, agricultural 
commodities and basic industrial commodities), producing services, and disposing 
of resources. The restrictions are the available labor, a fixed supply of foreign 
exchange and fixed domestic commodity requirements. The objective function is the 
use of capital, which is to be minimized. The solution yields that production, import 
and export pattern minimizing capital use. The solution to the dual yields the prices 
for labor, foreign exchange and the commodities. The results of an application of 
such a model to data for Southern Italy are presented. Henderson’s coal study and 
Fox’s feed grain shipments analysis are among the other linear programming applica- 
tions that are described briefly. 
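
The structure of such a programming model and its dual can be sketched in generic notation. The symbols are assumptions introduced here, not the authors' own: x is the vector of activity levels, k the capital used per unit of each activity, and the rows of Ax ≥ b collect the domestic commodity requirements together with the labor and foreign exchange limits (the latter two written with signs reversed so that every restriction reads "≥"). The dual variables y are then the prices of labor, foreign exchange and the commodities.

```latex
\begin{aligned}
\text{Primal:}\quad & \min_{x \ge 0}\; k^{\top} x
  \quad \text{subject to} \quad A x \ge b, \\
\text{Dual:}\quad & \max_{y \ge 0}\; b^{\top} y
  \quad \text{subject to} \quad A^{\top} y \le k .
\end{aligned}
```

When both problems have optimal solutions x* and y*, the duality theorem gives k'x* = b'y*: the minimized capital use equals the value of the requirements and resource limits at the dual prices.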

An economist not familiar with input-output or linear programming, but in- 
terested in problems of economic development or trade should find this book interest- 
ing and useful. Its use of statistics (inference or estimation) is virtually nil; but statis- 
ticians who are data collectors will get many ideas on what data are useful for studies 
of development and structural forecasting. 


The Theory of Linear Economic Models. David Gale. New York: McGraw-Hill Book 
Company, Inc., 1960. Pp. xxi, 330. $9.50. 


Dale W. Jorgenson, University of California, Berkeley 


THE menu from which a student of linear programming may select a textbook is 
extraordinarily rich. Hardly a month goes by without the appearance of a 
new book on the subject. Unfortunately, most books now available are oriented to 
special fields of application—business and farm management, operations research, 
economic theory. To get to the mathematical meat of the subject it was necessary to 
go directly to the original research papers until the appearance of David Gale’s new 
book, The Theory of Linear Economic Models. 

Perhaps the best review of the book is provided in a useful eleven-page preface to 
the main text written by the author; this preface should be consulted for a detailed 
discussion of the contents of the book. Gale has set out to write a textbook covering 
topics in programming, games, and mathematical economics which can be treated 
using only the algebra of finite dimensional vector spaces and the theory of linear 
equations and inequalities in such spaces. The book is mathematically self-contained, 
but should probably be preceded by at least two years of undergraduate mathe- 
matics. A course in which Gale’s book is used as the principal text will fit easily into 
a mathematics or statistics curriculum at the junior level, possibly preceded by a one 
semester course in linear algebra. Since no use is made of the calculus, great flexibility 
is permitted in formal preparation for the course. This fact makes the book especially 
suitable for use in graduate programs in various “applied fields” such as economics, 
business administration, and industrial engineering. Advanced students with under- 
graduate training in engineering or a good stiff year of mathematics before beginning 
the course will find no difficulty in handling the mathematical aspects of Gale’s book. 

A brief introduction is followed by a comprehensive treatment of topics in real 
linear algebra required for development of the theory: vectors and matrices, linear 
equations and inequalities, geometrical treatment of linear inequalities, convex 
cones, convex sets and polytopes, extreme vectors and extreme solutions. The discus- 
sion of linear algebra is followed by a rigorous but mathematically elementary treat- 
ment of the theory of linear programming. A chapter on the simplex method for com- 
putation of solutions to linear programming problems and another on “integral” 
linear programming, by which Gale means the transportation problem and related 
problems, not “integer” programming, complete the discussion of linear programming. 
Two additional chapters are devoted to game theory and the book is concluded 
with two chapters on other linear economic models, specifically, “input-output” 
models and the von Neumann model of an expanding economy. 

Each chapter is accompanied by a well-designed list of exercises. A wide range of 
examples of linear programming problems is given in the introduction and the two 
chapters on simplex method and transportation models. In discussion of the principal 
mathematical results (existence of a solution to a linear programming problem, the 
duality theorem, complementary slackness, existence of basic solutions and convergence 
of the simplex method of computation, von Neumann’s minimax theorem, existence 
of non-negative solutions to open static input-output systems, and existence of 
equilibrium for the von Neumann model), proofs classic in lucidity and conciseness, 
based in part on previous work of the author (with collaborators), are presented. A 
number of interesting new proofs are given, especially in the two chapters on other 
linear economic models. Throughout, the exposition is carefully motivated by an 
appeal to the economic interpretation of each problem, mainly in terms of the theory 
of a competitive economy. The principal results are reduced to elementary terms 
with little or no sacrifice in directness or brevity. As one example, in discussion of the 
existence of a non-negative solution for the open static input-output system, no use 
is made of Frobenius’ theorem or of the notion of characteristic roots of a matrix; 
the whole topic, including non-negativity of the inverse of a Leontief-Minkowski 
matrix, occupies a mere five pages of text. As a consequence of the author’s care in de- 
veloping standardized notation and terminology and his skill in unifying the underly- 
ing theory, the student is forcefully presented with a number of the characteristic 
features of modern applied mathematics: rigor, parsimony, elegance, scope of applications. 
In this respect, Gale’s book may be compared with Feller’s treatment of 
discrete probability theory and its applications.¹ 

A few words to economists may be in order at this point. The level of mathematical 
difficulty of this book is not much greater than that of the well-known text by Dorfman, 
Samuelson, and Solow.² While the latter will probably be more useful in intermediate 
and advanced courses in economic theory, Gale’s book will most likely be 
preferable as the primary text on linear economic models for any course in “mathe- 
matical methods” or “mathematical approach” to economic theory. 

For the truly mathematically mature, this book is probably less satisfactory than 
the recent treatise of Karlin,³ which covers some of the same ground, but from a 
much more advanced viewpoint. The objective of the two books is essentially the 
same—to unify mathematical theory in parts of economics, programming, and games. 
While Karlin’s work is truly comprehensive and provides an introduction to the 
field for research workers in mathematics, Gale’s objectives, as outlined above, are 
quite different. These objectives are attained with rare expository skill, for which 
students (and teachers) in both “applied” and “pure” mathematics will be grateful. 


Federal Receipts and Expenditures During Business Cycles, 1879-1958. John M. Firestone. 
(A Study by the National Bureau of Economic Research.) Princeton: Princeton 
University Press, 1960. Pp. xvi, 176. $4.00. 


Otto Eckstein, Harvard University 


FIRESTONE has applied the National Bureau business cycle analysis to time series 
of the Federal Budget. He has constructed monthly time series, 1879-1958, for 
expenditures, total receipts and five components of receipts, seasonally adjusted 
them, and studied their behavior during the twenty business cycles since 1879. 

The concepts of revenue and expenditure used (dictated by the availability of a 
long, continuous series) are the figures given in the Daily Treasury Statement, 
which are the cash deposits and withdrawals of the general account of the Treasurer 
of the United States. They do not correspond to the “cash budget,” the major dif- 
ference being inclusion of intra-governmental transactions and exclusion of trust 
fund receipts and expenditures. 

The twenty cycles were split into five groups for analysis, ten before World War I, 
four interwar, three war, and two postwar cycles, plus 1954-58. Generally, the bud- 
get conformed to the cycle in the way theory suggests, with rising cash surpluses in 
prosperity and deficits in depression. Most of the cyclic variation in peacetime is on 
the revenue side, in customs receipts in the earlier cycles, income taxes later on. 
Expenditures have risen on an irregular trend. In war, the surpluses have shrunk 
(or deficits risen) as expansion has proceeded. The main text of the book analyzes 
the behavior of the series during the five groups of cycles, testing for cyclic conformity 
and describing some peculiarities of individual cycles. This material is of historical 
and analytic interest to any student of government fiscal policy. 

1 Feller, W., An Introduction to Probability Theory and Its Applications, Vol. I, 2nd edition. New York: Wiley, 
1957. 

2 Dorfman, R., Samuelson, P. A., and Solow, R. M., Linear Programming and Economic Analysis. New York: 
McGraw-Hill, 1958. 

3 Karlin, S., Mathematical Methods and Theory in Games, Programming, and Economics. Reading, Massachusetts: 
Addison-Wesley, 1959, 2 vols. 

The study finds that the cyclic conformity of revenues, particularly in peacetime, 
is almost wholly due to automatic revenue changes, not changes in the rates. In fact, 
out of 101 tax and tariff changes in peacetime, 64 were out of phase, 37 tax cuts dur- 
ing expansion, 27 rate increases during contraction. One wishes this analysis had 
been pushed further: has conformity increased over time? Were some of the “wrong” 
changes near the end of expansion or contraction and therefore beneficial in effect? 
How many were changes in taxes levied for non-revenue purposes? In the case of 
tariffs, shouldn’t a rate increase be considered stimulating in the short-run? And 
were not some of these changes correct cycle policy despite their sign? Crude inspec- 
tion of the data does not suggest that these factors account for most of the perverse 
cases. But more than a simple tally is required to establish so fundamental a point 
about discretionary tax changes. 

The book has the virtues and the vices of the strict National Bureau approach to 
business cycles. It makes a new time series, about an important economic unit, avail- 
able to other scholars. It is a model of historical statistical scholarship. The methods 
and sources for constructing the series are set forth so clearly that someone else could 
reproduce the results. (How one wishes the statistical agencies of the Federal govern- 
ment did the same!) The study of cyclic timing yields some interesting, though few 
surprising, generalizations, and provides a framework for discussing the deviations 
from the pattern in particular cases. 

But the approach also has its faults, some of which become particularly serious 
in the case of government receipts and expenditures. First, since the movements of 
these series are not the result of the decisions of many individual economic units, as 
in the private economy, the analysis should separate discretionary changes, particu- 
larly in tax rates, from induced changes. While useful data on the direction of change 
in tax rates are presented, they are not part of the main analysis of the time series. 
Second, too much is dictated by data availability. The Daily Treasury Statement 
concept, while available with adjustment for a long period, is not what one would like 
for modern theoretical analysis. It does not correspond closely enough to the cash 
budget, the public accounts in the national income statistics or the administrative 
budget. These figures, particularly in the short-run, do not always move in the same 
pattern as the other concepts, and hence are difficult to interpret. Also, the revenue 
subdivisions lump personal and corporate income tax collections together, yet a busi- 
ness cycle analysis requires knowledge of the two components. Third, the definition 
of cyclic conformity understates the conformity of expenditures. Because they were 
on an upward trend, they did not conform properly in some upswings. If trend were 
removed, more conformity might be found. 

Fourth, while the grouping of business cycles into five categories eliminates the 
worst of the drawbacks of averaging reference cycle experience, too much diversity 
remains in some cases to make cycle averages useful units of analysis. The four inter- 
war cycles are dominated by 1927-33, so the averages have little meaning. The series 
in the two cycles following the world wars have, in fact, little in common in timing, 
direction or magnitude, other than the fact that expenditures fell for a while both 
times. 

These difficulties aside, the book represents a useful addition to our knowledge of 
public fiscal affairs. The series can be used in other analyses, if their conceptual 
limitations are kept in mind, and the results of the analysis for the twenty cycles, 
apart from their historical interest, can provide a perspective on current conditions. 


Family Planning, Sterility, and Population Growth. Ronald Freedman, Pascal K. Whelpton, 
and Arthur A. Campbell. New York: McGraw-Hill Book Company, Inc., 1959. Pp. 
xi, 515. $9.50. 


Harold F. Dorn, National Institutes of Health 


CONFIDENCE in the accuracy of population forecasts, or projections as they were 
often rather euphemistically called, became widespread during the decade of the 
1930’s. As the birth rate continued its downward trend during the early years of this 
decade, the possibility that the total population of the United States might cease to 
increase rapidly was accepted as highly probable; indeed, the belief that the size of 
the population might reach a maximum before the end of the century, after which it 
would slowly decline, was publicly endorsed by many demographers. 

The rise in the birth rate that began during the mobilization prior to World War II 
and the “baby boom” following the armistice at first were regarded merely as tempo- 
rary interruptions in the long trend toward continually smaller families. But as the 
increased annual number of live births persisted year after year, demographers and 
non-demographers alike began to doubt the reliability of the former population fore- 
casts. Finally even the most stubborn demographers conceded that the downward 
trend in the size of completed families probably had ceased and that the population 
of the United States, instead of decreasing, almost certainly would increase rapidly 
during the last half of the present century. 

This experience stimulated a search for new concepts and methods for analyzing 
and projecting fertility and population trends. This book reports the preliminary 
results of an investigation designed to unravel the extent to which the deliberate 
planning of family size may account not only for long time trends in the completed 
size of family, but also for temporal fluctuations in child spacing during the years of 
family formation. 

The study had three principal objectives: (a) to ascertain the influence of impaired 
fecundity and contraception on family size, (b) to determine number of children 
wanted and expected by married couples of childbearing ages, and (c) to use this 
information to provide more reliable forecasts of the number of births as a basis for 
estimating the future population of the United States. 

Data were collected in 1955 by the staff of the Survey Research Center of the 
University of Michigan by interview with 2,713 white married women between 18 
and 39 years of age either living with their husbands or temporarily separated be- 
cause the husband was in the armed forces. The women interviewed were selected by 
an area probability sample. Since many of the newborn babies during the next two 
decades will be borne by women who were single in 1955, 254 white single women 18 
to 24 years of age also were interviewed. This book reports only on the data obtained 
from women who were married in March 1955. 

The cooperation of the women was excellent. Interviews were completed for 91 
per cent of those selected. Only 5 per cent refused to cooperate; the remainder could 
not be located, were too ill to be interviewed, or were away from home. 

The authors believe that the women answered honestly and seriously. Alternative 
phrasing of questions was used to overcome the problem of reluctance to admit the 
use of birth control or of varying concepts of whether specific practices, e.g. douching, 
were methods for limiting the number of children. 

A comparison of characteristics of the women interviewed with corresponding data 
from census statistics indicated that the sample was satisfactorily representative of 
the appropriate total U. S. population. The replies to certain factual questions, e.g. 
number of children, agreed reasonably closely with independent data from other 
sources. 

There is no way of checking the validity of the respondents’ statements concerning 
the ultimate number of children they expect to bear. Ultimately it will be possible to 
compare the actual number of births with the number forecast based on the replies 
given by the women in this sample. This comparison will show whether women tend 
to overstate or understate the size of their completed families but it will not reveal 
whether any observed difference is due to deliberately false reports, lack of a clear 
ultimate objective, failure to carry out expressed wishes, or a subsequent change in the 
desired size of family. 

Data are presented concerning the sterility and fecundity of American families, 
and prevalence of the use of methods to regulate conception; the relationship be- 
tween social and economic factors such as religion, education, income, occupation, 
employment of wife, and community background and efforts to plan the size of fam- 
ily; attitudes toward family limitations; methods used to control fertility and their 
effectiveness; the relative influence of physical impairments and family planning upon 
the size of family; fertility of various social and economic groups; and the expected 
family size by the end of the childbearing period. These data are used to project the 
future trend in family size between 1955 and 2000. From these results the authors pre- 
pare population projections for the same period. 

Five principal conclusions are drawn from the analysis of the information obtained 
from these 2,700 wives: (a) although impairment of fecundity is very common 
it is not important in determining population trends, (b) family limitation is approved 
and practiced effectively by the vast majority of white couples, (c) all groups of the 
American population are approaching a common set of values with respect to size of 
family, (d) the two-to-four child family is widely accepted as ideal, and (e) if present 
family growth plans are continued and realized, the American population will grow 
rapidly! (What a change from the projections of the 1940’s.) 

This is a pioneering study. It was carefully planned and carried out. The data have 
been competently analyzed and the important results clearly presented. For the first 
time, reliable knowledge is available concerning the prevalence of family planning in 
the population of the United States and the desired size of family. These data are 
valuable in themselves and are worth the effort to assemble them. 

But just as moths are attracted to a bright light, so are demographers fascinated by 
population projections. The estimates of future population published in this book 
do not differ radically from those previously made by the Bureau of the Census with 
more limited data. However, when the actual population of the future differs from 
the projections published here, the authors will have more carefully documented why 
they were wrong. 


Theory and Methods of Scaling. Warren S. Torgerson. New York: John Wiley and Sons, 
1958. Pp. xiii, 460. $9.50. 


Joseph L. Zinnes, Indiana University 


SCALING methods as defined in this book are procedures for constructing scales for 
the measurement of psychological attributes. Scales themselves seem to be the end 
product of a measurement procedure and measurement, we are told, involves the 
assignment of numbers to objects. This book then is concerned with procedures for 
assigning numbers to objects, not, in contrast with other psychometric texts, with 
procedures for determining thresholds, just noticeable differences, points of subjec- 
tive equality, or psychophysical laws. 
The scaling methods covered are restricted to those which are “fundamental.” 
These are methods “whose general rationale involve the construction and application 
of a self-contained testable theory” (p. 418). Measurement by definition or by a 
derived process is specifically excluded. Excluded, therefore, is the problem of select- 
ing and weighting indicants and the mental testing or individual-difference methods. 
The latter are excluded because they involve at the most a weighted sum of items 
answered correctly and this, it is argued, is essentially measurement by definition. 
One direct consequence of treating fundamental methods exclusively is that the 
notion of goodness of fit is, or ought to be, applicable to the methods which are cov- 
ered. 

In addition to the “fundamental” criterion two other explicit criteria govern the 
author’s selection of scaling methods. Firstly, the methods must be general, not 
solely applicable to a single attribute or context. The problem of estimating parameters 
within a given theory is therefore excluded. And secondly, the methods must 
yield at least an ordering of the objects (i.e., an ordinal scale) so that classification 
procedures or measurement procedures leading to a nominal scale are also excluded. 
Within the area mapped out by these self-imposed restrictions there appear to be 
further omissions. The huge literature on scaling utility, for example, is completely 
ignored although much of this literature has both direct and broad implications for 
scaling theory. 

As to the methods considered, these are divided into two main categories: the 
stimulus centered or judgment methods, in which the response variability is attrib- 
uted to differences between the stimuli, and the response methods in which the re- 
sponse variability is attributed to both subjects and stimuli. The judgment methods 
described in Chapters 4-10 include the quantitative judgment methods (e.g., bisec- 
tion and fractionation methods) and methods based on the Thurstone model. These 
Thurstone methods together with their multidimensional extensions (covered in 
Chapter 11) take up nearly one half of the 357 pages that are devoted to scaling 
methods. The response methods described in the remaining three chapters include 
primarily the scalogram methods of Guttman, latent structure and latent distance 
models of Lazarsfeld, and the unfolding methods of Coombs. 

The basic format of each chapter is essentially the same. The scaling methods are 
discussed under the headings: underlying theory, the experimental procedures for 
obtaining the relevant kinds of data, the analytical procedures for estimating the 
parameters of the theory, and, finally, ways of evaluating goodness of fit. 

The author has, it seems, at least two basic aims in this book: to provide a clear 
presentation of each method, making the assumptions and character of the method 
evident, and to provide a theoretical framework or basis for comparing and integrat- 
ing the methods. These tasks are not as simple as they might sound. Scaling theorists 
have tended to work independently of each other so that their joint efforts do not 
form an obvious coherent whole. Furthermore, the early scaling methods were de- 
veloped without sufficient mathematical detail and completeness, and the more re- 
cent methods, scattered throughout the journals, can be somewhat formidable in 
appearance. 

Torgerson’s main contribution in this book lies in his handling of the first problem. 
His writing is thoroughly and amazingly lucid without any sacrifice in detail and 
completeness. At the same time the text does not have a cookbook quality about it. 
No magic formulas or prescriptions are offered. The writing is serious and where 
necessary, which is often, conclusions are reached tentatively. The author has for the 
most part even refrained from reporting many of the standard practices of scaling 
theories without first subjecting them to some critical analysis. In short, the tone of 
the book is altogether consistent and fitting with the present tentative state of the 
field. 

Of course, occasional slip-ups do occur over which one could quibble. There are 
some errors of reporting (e.g., Coombs’ Task A and Task B are incorrectly identified 
as the author’s judgment and response methods, respectively, rather than vice versa); 
some inconsistencies in notation (e.g., population and sample values are indicated in 
at least three distinctly different ways at various points of the book); and some un- 
critical descriptions of dubious conventions (e.g., the least squares fitting procedure 
and the goodness of fit test for the Thurstone methods are related without discussing 
the nature of the approximations involved). Despite the fact that this book is not 
wholly free from defects, it is, and probably will remain for some time, the best single 
reference book of scaling methods. Its only major deficiency as a reference book is its 
minor treatment of the Bradley-Terry methods which have recently come into much 
prominence among psychologists, primarily as a result of the work of R. D. Luce. 

The author fares less well with the more difficult problem of providing a theoretical 
basis or organization. Although there is a clear attempt to provide a unified or ab- 
stract treatment of scaling methods, for example, by grouping together different ex- 
perimental procedures which employ the same scaling model, the level of unity which 
is achieved is still fairly minimal. All too often the distinctions which are made appear 
arbitrary and tend to emphasize what is conventional rather than what is basic. The 
Thurstone theory is described as a judgment method although it could equally well 
be employed as a response or subject centered method. In fact much the same could 
be said for most of the scaling models which are classified in one way rather than 
another. Furthermore the properties which are attributed to these various classes 
of scaling methods do not appear to be intrinsic to them. The judgment method is 
characterized as a method in which the attribute is clearly specified a priori, although 
in the case of the multidimensional methods this is certainly not the case. Also the 
fact that two long chapters (Chapters 9 and 10) are devoted to methods which em- 
ploy the same scaling model does not help to increase the general level of abstrac- 
tion. 

More serious than the lack of unity is the absence of an adequate scaling model in 
many cases. This is particularly true of the three chapters on quantitative judgment 
methods. No theory is developed which rigorously justifies the numerical assign- 
ments obtained by these methods. It is for this reason that the goodness of fit tests 
are so difficult to formulate. It is not obvious what properties these numerical assign- 
ments must satisfy to constitute a scale and hence what properties can form the basis 
of a goodness of fit test. The list of properties which the author gives for each scaling 
method does not distinguish between those properties which make the scale stable 
(and hence desirable) and those which are in fact necessary for the existence of the 
scale. (In one instance a scale is ruled out because no properties could be found which 
could form the basis of a goodness of fit test although this scale of interpoint distances 
would nevertheless prove useful if, in a particular context, it led to a Euclidean 
space of few dimensions.) 

Apart from the lack of adequate scaling models underlying the quantitative judg- 
ment methods, what is needed is a general discussion of the nature of scales, a discus- 
sion that highlights the essential elements involved in the construction of a scale or in 
determining the existence of a scale. The general discussion of science and measurement 
in the first three chapters does not, as one would hope, perform this function. 
There is little connection between the general considerations in these early chapters 
and the scaling methods described in the later chapters. 

By and large the deficiencies of this book are those which are characteristic of the 
scaling field. One cannot refrain from hoping that the major impetus of this book 
will be in directing scaling researchers to these gaps of knowledge and in goading the 
researcher into the job of filling them in, both at the top and at the bottom, rather 
than in providing a handy file for the practitioner. 


The Strategy of Conflict. Thomas C. Schelling. Cambridge, Mass.: Harvard University 
Press, 1960. Pp. 309. $6.25. 


Anatol Rapoport, University of Michigan


When Francis Bacon wrote “Knowledge is power,” he was saying something
startling to the people of his age, for it had not occurred to many that the ability 
to control the environment involved a knowledge of the environment. Today the 
dictum is a truism, so much so that the power to control is seen as the only aim and 
justification for seeking knowledge. Accordingly, the scientist in our age enjoys an 
unprecedented prestige, not, however, in his capacity to increase man’s wisdom but 
only for his presumed ability to bestow power. 

This conviction has been constantly reinforced by the mounting triumphs of 
physical science which has solved the major problems of controlling the physical 
environment. Now that the problem of controlling the human environment looms 
enormous, the scientist is called upon with increasing insistence to provide solutions, 
and the solutions he is called upon to provide are expected to be of the same sort as 
the solutions that made control of the physical environment possible. 

The well-known mode of the scientific statement is of the form “If so, then so.” 
Accordingly if the “if” can be brought about, the “so” will follow. The extension of 
this framework to the problem of controlling human behavior would be feasible, it
seems, if similar causal relations were established, connecting controllable conditions 
with determinate behavior. Granted that in many areas such conditions may exist, 
there is a broad class of situations where they do not, namely where those whose be- 
havior is to be controlled themselves enjoy a degree of control over the state of affairs 
and moreover have interests opposed to the would-be controller. 

It is not surprising, therefore, that when the theory of games sprang practically 
full grown from the minds of J. von Neumann and O. Morgenstern and was enthusiastically
hailed by mathematicians for the richness of its mathematical concepts and
by “hard-headed” social scientists for its rigorous formulation of rational conflict, 
hopes rose sharply among those professionally involved in conflict (managers of 
competing economic units, military strategists, etc.). Here at last was a new logical 
apparatus which stood in the same relation to problems of conflict as the logic of 
physical science had stood in relation to the problem of controlling nature. To the 
professional, whose business is the design of strategy and who has sufficient under- 
standing of the importance of fundamental theoretical research, game theory ap- 
peared as the basic theoretical framework for discovering the fundamental laws of 
strategic conflict. 

With regard to the theory of the two-person zero-sum game, there is hardly any 
question that the Fundamental (Minimax) Theorem of von Neumann does provide 
such a foundation for a theory of conflict. That is not to say, of course, that in any 
situation which can be conceived as a two-person zero-sum game, the “best” decision 
can always be ground out on a computer. The hyper-astronomic number of strategies 
involved in any but trivial games, the difficulty of estimating utilities and the 
artificial nature of mixed strategy make the application of “the” minimax solution 
all but impossible in most real contexts. But the difficulties are technical, not con- 
ceptual. Just as in meteorology, where the physical laws governing all the factors 
which determine weather are known, the problem of prediction has been reduced to 
practical problems of increasing the accuracy of observations and the speed of computations,
so in zero-sum conflict situations, the problems are technical and practical
rather than conceptual. In the theory of conflict of diametrically opposed interests
the conceptual problem has already been solved. 

But this is not true of other kinds of conflict, e.g., essential games with more than 
two players and non-zero-sum games. Here the game theorist is plagued with para- 
doxical and indeterminate solutions, familiar to every one who has examined game 
theory beyond the two-person zero-sum game. 

Already von Neumann and Morgenstern recognized that extra-game-theoretical 
concepts must perforce enter any investigation which purports to connect the theory 
of games even to normative prescripts of action let alone predictions of actual be- 
havior, whenever the limits of the two-person zero-sum game are transcended. These 
extra-game-theoretical considerations involve the dynamics of bargaining, social 
norms, the role of trust and suspicion, comparison of interpersonal utilities and many 
other topics. All of these topics are excluded from “classical” game theory. 

Dr. Schelling’s book is devoted to the examination of these extra-game-theoretical 
topics. It should not therefore be judged as a contribution to game theory, as a game 
theorist understands it, but as a contribution to the problem of linking game-theoreti- 
cal concepts with other concepts in order to make possible more determinate norma- 
tive recommendations to the decision maker or, at least, to bring to his attention the 
problems which must be attacked if such normative recommendations are to be at all 
possible. 

The principal theme of the book appears to me to be the role of communication 
in situations involving strategic choice. I do not say “strategic conflict,” because 
typically the situations depicted are mixed-motive situations, as Dr. Schelling calls 
them, in which the interests of the players are at least partially (sometimes entirely) 
coincident. Problems of communication had been systematically by-passed in the 
“classical” game theory: in the two-person zero-sum game, communication is of no 
consequence; in coalition games, a coalition of advantage to all concerned is always 
presumed possible, i.e., involves no communication difficulties. 

In real life, however, communication problems are often central in situations with 
multiple control. The simplest such situation is depicted in the anecdote about 
Holmes and Moriarty, each of whom must choose one of two stations to get off the 
train. Whether they are enemies (e.g., if Holmes pursues Moriarty or vice versa) or 
friends (hoping to get off at the same station), each wishes to obtain information 
about the other’s choice. If they are enemies, each wants to conceal this information 
from the other; if they are friends, each wishes to convey it. In either case, the in- 
teresting problem, in Dr. Schelling’s view, is the problem of getting, concealing, or 
conveying information, a problem, as we have said, by-passed in game theory, because
there it is assumed that either such information is absolutely unobtainable (in
which case the best either can do in either game is to choose a mixed strategy, e.g., flip a
coin), or the information is available (in which case both of the Holmes-Moriarty 
games collapse into a trivial decision). 

Dr. Schelling, however, chooses to examine precisely cases of this type (involving 
mixed motive games) in which conveying some information would be beneficial to 
both parties (even partial enemies) but where this cannot be done directly. This 
leads him to discussions of so-called “games of coordination” and games of tacit 
collusion, in which psychological considerations play a prominent part. 

I wish to meet some one, and he wishes to meet me, but we cannot communicate. 
“Where would I go, if I were he?” is the first question that comes to mind. But this 
is not enough. He, too, is asking this question. Therefore I must go where he thinks 
I would go if I imagined that I were he. But this carries the process only one step 
back. In principle the chain is infinite. “What would he do if he thought that I 
thought that he thought, etc. ...” This bizarre phrase recurs so many times throughout
the book that one gets an impression of a powerful whirlpool, which keeps sucking
the discussion in and from which there is no escape. This stylistic device may irritate
some readers, but I believe that Dr. Schelling’s emphasis on this cycle, which characterizes
the analysis of so many mixed-motive conflicts, is justified, because it is
lost sight of far too often by decision makers with awesome responsibilities. 

In this connection, it must be pointed out that in at least two situations, the circu- 
lar search for the promising strategy is not a vicious circle in the sense that the search 
does converge on a decision. The first situation is the two-person zero-sum game 
without a saddle point. In a 2×2 game of this kind, if one chooses a strategy on the
basis of what the other is expected to do and credits him with divining this decision, 
then one is drawn to the opposite strategy as being more advantageous. But if one 
credits the opponent with divining this decision, then the first strategy again appears 
more advantageous. The minimax mixed strategy solution resolves this impasse in a 
way which appears as a natural generalization of the pure minimax strategy which 
occurs in games with saddle points. 
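
The resolution described above can be made concrete with a small computation. The
following Python sketch is an illustration added here and is not taken from the book
under review; the function names and the sample pay-off matrices are assumptions chosen
for the example. It uses the standard closed-form solution for a 2×2 zero-sum game
without a saddle point, in which each player mixes so that the opponent is indifferent
between his two strategies.

```python
from fractions import Fraction


def has_saddle_point(a):
    """True if the 2x2 matrix of pay-offs to the row player has a saddle point."""
    maximin = max(min(row) for row in a)
    minimax = min(max(a[0][j], a[1][j]) for j in range(2))
    return maximin == minimax


def solve_2x2_zero_sum(a):
    """Minimax mixed strategies for a 2x2 zero-sum game without a saddle point.

    a[i][j] is the pay-off to the row player when row plays i and column plays j.
    Returns (p, q, v): p is the probability that row plays strategy 0,
    q is the probability that column plays strategy 0, v is the value of the game.
    """
    if has_saddle_point(a):
        raise ValueError("game has a saddle point; a pure strategy is optimal")
    (a11, a12), (a21, a22) = a
    denom = Fraction(a11 - a12 - a21 + a22)
    p = (a22 - a21) / denom   # makes the column player indifferent
    q = (a22 - a12) / denom   # makes the row player indifferent
    v = (a11 * a22 - a12 * a21) / denom
    return p, q, v


# Hypothetical example: "matching pennies", the coin-flip case.
print(solve_2x2_zero_sum([[1, -1], [-1, 1]]))
```

For matching pennies the sketch returns probabilities of one-half for each player and a
value of zero, which is the coin flip mentioned earlier. Because the mixture leaves the
opponent indifferent, crediting him with divining one's decision no longer changes the
choice, and the circle closes.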

The second situation in which the circular search converges on a decision is in a 
“game of coordination” where it is of advantage to both parties to coordinate their 
choices, even though their interests may be for the most part opposed, in the case 
that a recognizable “distinct choice” exists. For instance if two people are asked to 
propose (independently of each other) how to divide $100, with the understanding 
that this money will be divided as they direct if the two proposals are identical and 
will be forfeited otherwise, most people will propose an equal division. For all pairs 
of proposals which agree, the sum of the pay-offs is constant, hence the interests of 
the players are opposed. But they must agree to get any pay-off, and the 50-50 divi- 
sion is the only “distinctive” choice, quite aside from its apparent “fairness.” As 
Schelling puts it, “If not here, where?” He cites many experiments in which the 
“focal point” of the situation serves to crystallize a tacit agreement. 
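
The structure of this division game can be written down in a few lines. The Python
sketch below is an added illustration under stated assumptions: proposals are taken to be
whole-dollar amounts naming the first player's share, and the names `TOTAL` and `payoffs`
are hypothetical. It only checks the two structural facts stated in the text; it does not
explain why the equal division is the one that stands out.

```python
TOTAL = 100  # dollars to be divided


def payoffs(proposal_a, proposal_b):
    """Each proposal names the share (in dollars) that goes to the first player.

    Matching proposals are carried out; mismatched proposals forfeit the money.
    """
    if proposal_a == proposal_b:
        return proposal_a, TOTAL - proposal_a
    return 0, 0


# Every agreeing pair splits the same total, so along the line of agreement
# whatever one player gains the other loses.
agreeing_sums = {sum(payoffs(x, x)) for x in range(TOTAL + 1)}

# Any disagreement leaves both players with nothing.
disagreeing_sums = {
    sum(payoffs(x, y))
    for x in range(TOTAL + 1)
    for y in range(TOTAL + 1)
    if x != y
}

print(agreeing_sums, disagreeing_sums)
```

There are 101 ways to agree in this whole-dollar version, and nothing in the pay-offs
singles out any one of them, which is why a "distinctive" choice is needed at all.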

In addition he cites what appear to be instances of tacit collusion among comba- 
tants in war. He believes that refraining from using poison gases in World War II 
may have been an example. In this connection he remarks that total abstinence had 
the advantage of being an unambiguous focal point. “Gas only on military personnel; 
gas used only by defending forces; gas only when carried by vehicle or projectile; no 
gas without warning—a variety of limits is conceivable; some may make sense, and 
many might have been more impartial to the outcome of the war. But there is a 
simplicity to ‘no gas’ that makes it almost uniquely a focus for agreement when 
each side can only conjecture at what rules the other side would propose and when 
failure at coordination on the first try may spoil the chances for acquiescence in any 
limits at all” (p. 75). 

We see then that the circular reasoning of the form “He thinks that I think that 
he thinks . . . ” converges in the case of zero-sum games without saddle points on the 
minimax mixed strategy and in the case of coordination games on the “distinctive 
choice.” In these cases a satisfactory way out of the impasse exists. In other situa- 
tions, this is not the case. In the game of “chicken” (whose counterpart in interna- 
tional diplomacy has come to be known as brinkmanship) the question is “Will he 
swerve in time?” The reasoning then goes, “If he will, I need not. So if he knows I 
won't, he will. So I won’t.” The trouble is that the other chain is equally convincing: 
“If he won’t, I must. So if he knows I must, he won’t.” But at least here there is a 
chance for swerving and staying alive, if one should decide that the prize is not worth 
the risk. It is otherwise in the now foreseeable outcome of “mutual deterrence.” As 
soon as the ultimate weapon is at the disposal of both champions of peace (the ultimate
weapon is one capable of striking the blow so as to make retribution impossible),
the “way out of the impasse” leads to catastrophe. For in this situation it is inevitable 
that the idea of a pre-emptive first strike is seriously considered. But then the very 
act of considering it makes it clear that the adversary is considering it also. Here too 
the circular chain of reasoning “converges” but not on a decision that bespeaks for 
our sanity: “I don’t want to do it, but he thinks that I do, therefore he may, therefore 
I should, therefore he will, therefore I must!” 
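
The indeterminacy of "chicken" can be checked mechanically. The Python sketch below is an
added illustration, not part of the book; the numerical pay-offs are hypothetical and
only their ordering matters (a crash is worst, swerving alone is a small loss, holding
course alone is a small gain). It lists the pairs of choices from which neither driver
would gain by changing his own choice alone, which in later terminology are the
pure-strategy equilibria.

```python
from itertools import product

SWERVE, STRAIGHT = "swerve", "straight"
STRATEGIES = (SWERVE, STRAIGHT)

# Hypothetical pay-offs (driver 1, driver 2); only the ordering matters.
CHICKEN = {
    (SWERVE, SWERVE): (0, 0),
    (SWERVE, STRAIGHT): (-1, 1),
    (STRAIGHT, SWERVE): (1, -1),
    (STRAIGHT, STRAIGHT): (-10, -10),
}


def pure_equilibria(payoffs):
    """Return the strategy pairs from which neither player gains by deviating alone."""
    stable = []
    for s1, s2 in product(STRATEGIES, repeat=2):
        u1, u2 = payoffs[(s1, s2)]
        best1 = all(u1 >= payoffs[(alt, s2)][0] for alt in STRATEGIES)
        best2 = all(u2 >= payoffs[(s1, alt)][1] for alt in STRATEGIES)
        if best1 and best2:
            stable.append((s1, s2))
    return stable


print(pure_equilibria(CHICKEN))
```

With these pay-offs exactly two pairs are stable: one in which the first driver swerves
and the second does not, and its mirror image. Each corresponds to one of the two chains
of reasoning quoted above, and the pay-offs alone give no ground for preferring either.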

Such are the problems which arise in mixed-motive conflicts, where communica- 
tion is scanty. The presence of ample communication opportunity presents its own 
problems, namely those of bargaining. 

Dr. Schelling examines the role of commitments, threats, and promises. A commit- 
ment is an irreversible choice by one of the opponents (burning the bridge behind) 
who deliberately restricts his own freedom of choice so as to be immune to threats or 
to forced concession. Threats and promises are declarations of future commitments 
contingent on certain acts of the other. Again the role of communication is crucial 
since not only are threats and promises ineffective if they are not believed but some- 
times work to the disadvantage of the one who makes them. This is especially true 
if threatened reprisals are out of proportion to the transgressions which they are sup- 
posed to prevent and if both parties stand to suffer if the threat is carried out, be- 
cause under these circumstances a bluff is likely to be called to the great embarrass- 
ment of the bluffer. If a threat is disregarded, the making good is usually of consider- 
able disadvantage to the threatener. Indeed there seems to be no “rational” reason 
to carry out the threat after it has been disregarded. (Revenge motivations do not 
usually enter into “rational” strategic considerations.) The “making good” of a threat 
can sometimes be defended on the grounds that the threats will be more likely to be 
believed next time. But this argument loses force if there can be no next time. There- 
fore the use of threats involves considerations of the likelihood that they will be be- 
lieved, considerations of the cost of carrying the threat out, the likelihood that the 
threat will be received and understood (a kidnapping is pointless if there is absolutely 
no way of getting in touch with the family of the victim), and the likelihood that the 
demand can be complied with (recall the signs on safes saying that the safe cannot be 
opened except at specified hours). 

Therefore immunizing oneself against threats can take peculiar forms, e.g., de- 
liberately incapacitating oneself to perform what is demanded; making it impossible 
to receive the threat, etc. Many legal measures are designed to provide just such 
immunity. Similarly, deliberately putting oneself in jeopardy may make one’s prom- 
ises more credible. Among the privileges of a corporation is the right to be sued. At 
first one might be surprised to see this listed as a “right” (who wants to be sued?), 
but one sees the point when one considers the implication: a corporation’s promises 
are more credible if the corporation can be sued for not carrying them out. Minors 
do not have the right to be sued. 

Although The Strategy of Conflict abounds in homely, often amusing examples 
illustrating the theoretical considerations with which the author is concerned, it is 
clear that the principal area of application of these principles is intended to be the 
area of international relations. Limited war, surprise attack, disarmament, nuclear 
weapons, etc. are the main topics of discussion. Since the object of the book is a 
logical analysis of the strategy of conflict, it is only proper for the author to be as 
noncommittal as possible on actual policy. His job was to outline criteria for guiding
and evaluating policies, not to prescribe the actual conduct of international affairs, 
and he stuck to his job. 

Yet the very treatment of the subject leads almost inevitably (for me at least) to 
certain conclusions about the severe limitations of the strategic approach to problems
of war and peace. Dr. Schelling himself points out (Chapter 4) that the attempt
to build a general theory of games around the nucleus of the zero-sum game has 
hindered the development of a general theory. I would take issue with the statement 
as it stands. The theory of games has been developed much beyond the zero-sum
game, and it is not the fault of the theoreticians that the results are so frequently 
indeterminate or psychologically disturbing. The mathematical theory of games was 
never meant to be a behavioral theory, but only a mathematical one, which examines 
the internal logic of certain situations without necessarily drawing conclusions about 
what this internal logic may imply in human affairs. Nor are the indeterminacies and 
paradoxes of “higher” game theory necessarily disturbing to the mathematician. Results
intuitively difficult to accept have often been hailed as triumphs of mathematical
theory (cf. Cantor’s set theory, Goedel’s results in metamathematics, etc.).

I do agree with Dr. Schelling’s statement, “That game theory is underdeveloped
... may reflect its preoccupation with the zero-sum game,” if I interpret it to mean
that the application of game theory to human affairs is not likely to yield significant 
results until the theory is linked with certain psychological, sociological, and even 
ethical or esthetic considerations. It is in this light that I wish to consider the au- 
thor’s theory of coordination and tacit communication which in his opinion plays 
a vital role in human conflicts. 

It is certainly interesting that in some instances people unable to communicate 
will reach a quicker agreement than those who can. Consider the pair which must 
agree on how to divide the dollar or lose it. In the absence of communication, there 
is only one conceivable solution, namely 50-50. With communication there may be 
quibbling, because each may feel that he can get more if he threatens to block agree- 
ment and so to forfeit the dollar for both. 

In this connection Dr. Schelling’s proposal of a certain experiment is highly in- 
triguing. A group of people are to cooperate (not compete) for a prize by writing 
down independently of each other limitations on the use of nuclear weapons—any 
limitations they may think of. If all the proposals agree, they get the prize; otherwise 
the prize is forfeited. 

I take it, the result to be anticipated is total prohibition, not necessarily because 
most or even any feel that this is the most desirable alternative but simply because 
it stands the greatest chance of being tacitly agreed upon. As in the preceding ex- 
ample, a quick agreement with communication is not nearly so likely. Might there 
not be a germ of a notion here on how to get the masters of our destiny to agree? 

These ideas, especially some of the striking paradoxes, are interesting and stimulating.
I believe, however, that they indicate the necessity of transcending game-theoretical
thinking (i.e., thinking exclusively in strategic terms) rather than the
need to incorporate into the theory of games matters which do not fit into its concep- 
tual repertoire. This means certain matters about which (let us face it) it is embarrass- 
ing to speak in business and military circles, where the presumption is that the only 
rational mind is the calculating mind. The fact remains that there is no rationally 
justifiable conclusion that leads the two players of a Prisoner’s Dilemma game with- 
out communication to insure for themselves the largest joint pay-off. Such an out- 
come can result only if “irrational” considerations are allowed to determine the 
choice of strategy, for example, “solidarity,” “trust,” “the determination to do the 
right thing, no matter what the consequences may be,” etc. Such considerations have 
until now been anathema to the “realists.” Among the strategists, it is perfectly 
proper to advocate “calculated risks” based on bluff, blackmail, and intimidation, 
but risks based on trust (which admittedly may be misplaced, else the risks would not 
be risks) fall automatically outside the scope of strategy because the associated concepts
are not even in the vocabulary of the strategist.
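
The claim about the Prisoner's Dilemma can be verified on a small table. The Python
sketch below is an added illustration; the pay-off numbers are hypothetical and chosen
only to have the ordering that defines the game. It checks that defection is the better
reply to either choice of the other player, and that mutual cooperation nevertheless
yields the largest joint pay-off.

```python
C, D = "cooperate", "defect"

# Hypothetical pay-offs (first player, second player); higher is better.
PAYOFFS = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}


def strictly_dominates(s, t):
    """True if s pays the first player more than t against every reply."""
    return all(PAYOFFS[(s, other)][0] > PAYOFFS[(t, other)][0] for other in (C, D))


def joint_payoff(profile):
    """Sum of both players' pay-offs for a pair of choices."""
    return sum(PAYOFFS[profile])


print(strictly_dominates(D, C))         # defection is better whatever the other does
print(max(PAYOFFS, key=joint_payoff))   # the pair with the largest joint pay-off
```

The calculating player is led to defect by the first check, while the second check shows
that the jointly best outcome requires both to do the opposite. Nothing inside the table
bridges the two results.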

I believe, therefore, that the greatest value of Dr. Schelling’s far-reaching analysis
of strategic conflict is in what it suggests to the thoughtful reader (as I am sure it did 
to the writer), namely that the very framework of thought in which the strategist 
must operate precludes a breakout from our present situation, in which a “tacit 
agreement” on the limits of mutual destruction seems to be the brightest prospect of 
the twentieth century. 


Consumption Patterns of the Aged. Study of Consumer Expenditures, Incomes and Sav- 
ings in the United States. Sidney Goldstein. Philadelphia, Pennsylvania: Wharton School 
of Finance and Commerce, University of Pennsylvania, 1960. Pp. xix, 304. $7.50. 


Janet A. Fisher, University of Wisconsin


Consumption Patterns of the Aged is too narrow a description for the monograph
Sidney Goldstein has prepared in the Wharton School’s Series entitled “Study
of Consumer Expenditures, Incomes, and Savings.” In addition to information on
expenditures, the volume includes discussions of relevant demographic questions,
income, income change, the ownership of certain durable goods and changes in assets 
and liabilities. Similar material for all age groups other than those at least 65 years 
old may also be found here. It is the text rather than the tables which concentrates 
attention on the age groups 65-75 and 75 and over. Information on the younger age 
groups is utilized for comparative purposes, and in certain chapters, the focus shifts 
to life cycle patterns. 

Although this study was undertaken primarily as an analysis of data gathered in 
the B.L.S. 1950 Survey of Consumer Expenditures, the author, a demographer and 
sociologist, has clearly combed the pertinent literature in his own fields and in economics.
He draws heavily upon a number of sources, not only for information to
supplement his primary data, but also to support some of his findings and to indicate 
certain problem areas with respect to the economic status and behavior of the aged. 
As Goldstein states in the Preface, his orientation is “toward those interested in the 
practical problems of the aged rather than toward the theoretical aspects of consump- 
tion behavior.” 

Following an introduction which points out the advantages of these 1950 data over 
those contained in previous studies, and also some of their limitations, Goldstein 
devotes a chapter to demographic questions and another to income. A chapter on 
total consumption expenditures precedes the nine chapters which deal with the major 
consumption expenditure categories. These ten chapters include the examination of 
most or all of such independent variables as disposable income, family size and com- 
position, and occupation, each in conjunction with age. A separate chapter is devoted 
to the joint “effects” upon consumption expenditures of age and such other variables 
as region, city size, and race. Thereafter, Goldstein discusses non-consumption ex- 
penditures, saving and dissaving. The final chapters summarize the results in rela- 
tion to the findings of several other studies. 

The appropriate set of criteria for appraising this monograph is not easily chosen.
Critical comments may vary from too little concern with statistical inference prob- 
lems in the substantive chapters to insufficient attention to questions of definition 
in the analysis of saving and dissaving. One may also quarrel with the author’s con- 
ception of a variable described as “age per se” (for example, pp. 44, 65, 109, 262). 

On the whole, however, it should and probably will be recognized that here is a 
wealth of information on the economic status and behavior of consumers in different 
age groups; that these age groups include not one but two categories for “the aged”; 
and that this set of data, unlike any previously available on the complete range of 
expenditures, is from a sample of 12,500 families chosen to represent the urban population
of this country.


The Population of Asia and the Far East, 1950-1980. Future Population Estimates by
Sex and Age: Report IV. United Nations, Department of Economic and Social Affairs. 
New York: Columbia University Press, 1959. Pp. viii, 110. $1.50. 


Harold F. Dorn, National Institutes of Health


This is the fourth report presenting estimates of future population of the various
regions of the world prepared by the Population Branch, United Nations, in
accordance with recommendations of the Population Commission. Previous reports 
have dealt with the future population of Central America, South America, and South- 
east Asia, respectively. This report presents estimates of future population growth 
for Asia and the Far East, defined as the territory south of the Soviet Union and 
east of Iran, and including roughly one-half of the estimated population of the 
world in 1950. 

The future population of 27 countries and territories is discussed. For China 
(Taiwan), Ceylon, and the eleven countries and territories of South-East Asia the 
detailed population projections published in the third report are summarized and re- 
published; no new projections are given for India and Japan, but recent projections 
prepared by other authors are shown. 

The first part of the report briefly compares and summarizes the projections for the 
four major areas into which the region was subdivided. The remaining two-thirds of 
the volume is given over to a detailed presentation of the basic data and the methods 
used to prepare the projections for each of the countries and territories. The ap- 
pendix contains tables giving the estimated future population by detailed age groups 
for India, Japan, Pakistan, Republic of Korea, and mainland China.

The authors of this report faced a formidable task. With the exception of Japan 
and some of the areas with a relatively small population such as Ceylon, Malaya, 
Singapore, and Taiwan this region is characterized by a lack of reliable censuses of 
population and incomplete vital statistics. For mainland China, with nearly one-fourth
of the estimated population of the world, demographers were unable to
agree even upon an estimate of the total population until the first modern census was
taken in 1953. In addition, World War II and the unrest and turmoil following its
termination increased the difficulty of distinguishing temporary demographic changes 
from more stable long time trends. The changes in government and social and eco- 
nomic conditions that have taken place in practically every country in this region 
during the past two decades cast doubt upon the value of the trends shown by the 
meager demographic statistics of the past as a reliable basis for future projections. 

The rather scanty data available suggest that the total population of this region 
has increased rather rapidly in the recent past in spite of war, famine, and disease. 
The population in 1950 was estimated to be one-third greater than that in 1920. This 
resulted primarily from a fertility rate so high that it more than counterbalanced a 
death rate that would be regarded as excessive by western standards. Since the end 
of World War II the mortality rate has been declining in most of the countries of this 
region, in some countries spectacularly, whereas the birth rate has remained near its 
former high level. 

Faced with an inability to reconstruct the past and confronted with rapidly chang- 
ing current social, economic and political conditions, it would seem at first glance 
that only captive bureaucratic servants, foolhardy gamblers, or daring survivors of 
Russian roulette would be so bold as to prepare and publish estimates of the popula- 
tion of the countries of this region for thirty years into the future. But after reading 
the discussions of the procedures used to prepare the projections I was filled with 
admiration for the ingenuity, caution, and flexibility displayed by the unnamed 
authors of this report. Regardless of what the future may reveal concerning the 
accuracy of the estimates of future population, it is not apparent how the general
methodology could have been improved. 

Recognizing that the most suitable method of estimation is determined by the 
nature and reliability of existing demographic statistics, the authors used a variety 
of procedures to project population growth for the various countries of this region. 
In general, they assumed that fertility will change slowly but that mortality will 
decline rapidly so that the rate of growth during the 30 years after 1950 will exceed 
the rapid growth during the 30 years prior to 1950. The estimated rates vary widely 
among the four major subregions into which the region was subdivided. The percent- 
age increase in population for Maritime East Asia is estimated as 51 while that for 
South-East Asia is 103. For the region as a whole, the 1980 population will be 72 per 
cent greater than that for 1950. 
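
The thirty-year percentages quoted above are easier to compare when converted to annual
rates. The Python sketch below is an added illustration: it assumes growth compounds at a
constant annual rate over each period, which is a simplification and not a method stated
in the report, and the function name `annual_rate` is hypothetical. The inputs are the
figures quoted in this review.

```python
def annual_rate(percent_increase, years):
    """Constant annual compound rate implied by a total percentage increase."""
    return (1 + percent_increase / 100) ** (1 / years) - 1


# Percentage increases over 30 years, as quoted in the review.
quoted = {
    "whole region, 1920-1950 (about one-third)": 100 / 3,
    "Maritime East Asia, 1950-1980": 51,
    "South-East Asia, 1950-1980": 103,
    "whole region, 1950-1980": 72,
}

for label, pct in quoted.items():
    rate = annual_rate(pct, 30)
    print(f"{label}: {100 * rate:.2f} per cent per year")
```

On this assumption the projected rate for the region as a whole after 1950 is a little
under two per cent per year, roughly double the rate implied by the one-third increase
between 1920 and 1950.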


Cancer in Families. A Study of the Relatives of 200 Breast Cancer Probands. Douglas P. 
Murphy and Helen Abbey. Cambridge: Harvard University Press, 1959. Pp. x, 76. $2.50. 


J. YerusHALMY, University of California, Berkeley 


The first step in investigating the relative roles which genetic and environmental
factors play in the etiology of a given disease is to determine whether or not it is
“familial.” If the disease or condition is rare the “pedigree” method may be expected 
to provide definitive answers. For the more common conditions and diseases the 
usual method is to compare, by means of a retrospective study, the incidence of the 
disease among the relatives of patients having the disease with the incidence among 
relatives of a control group. In the operation of such an investigation many problems 
arise. The most serious are the selection of the cases and appropriate controls and the 
possible bias introduced by a difference in accuracy and completeness of response 
which may result from the fact that the relatives of the study group are likely to be 
more aware of the disease than are the relatives of the control group. 

The authors of the book under review are well aware of the difficulties of such in- 
vestigations and have taken precautions to overcome them. They were concerned 
with the question of whether there is an increased incidence of cancer among the 
relatives of 200 breast cancer patients admitted to 28 hospitals in the Philadelphia 
area compared with those of 198 women admitted to the University of Pennsylvania 
Dental Clinic during the same period. 

Murphy and Abbey found no evidence of a familial tendency to develop cancer 
of the breast or of any other site. This finding is not consistent with the results of 
other investigations which did find such a tendency. Included among these is a recent, 
well-conducted study by Anderson, Goodman and Reed.¹ Murphy and Abbey have
gone to great lengths to present thorough analyses aimed at a comparison of the 
study and control groups, and they show that, for the factors which allow comparison,
there are no meaningful differences between the two groups of relatives.

The fact that two well-conducted studies are in disagreement in their findings 
points up the deficiencies still existing in this method of study. Much more work must 
be done on the serious question of selecting controls; it is likely that in many of the 
studies more than one control group must be used. 

Similarly, methods must be developed which will minimize, if not eliminate, the 
bias in response which may be present in retrospective studies. In this connection, 
the reviewer was impressed by Appendix II of this monograph, which indirectly 





1 Anderson, V. Elving, Goodman, Harold O., and Reed, Sheldon C., Variables Related to Human Breast Cancer. 
Minneapolis: The University of Minnesota Press, 1958. 
bears on the subject. Because of a broadening of the objectives of the study after the
first round of interviews was completed, a resurvey was conducted which brought
forth additional relatives not reported on in the original survey. In the main body of 
the monograph the data obtained in the two surveys are combined, but in an ap- 
pendix the data are presented separately so that the findings may be compared. 
Thus in effect a view is obtained of a sample of the relatives who were “unobserved” 
in the initial survey. The interesting finding is that among those discovered only by 
the resurvey the frequency of cancer was 10.5 per cent among relatives of the control 
group compared with only 4.7 per cent among the relatives of the breast cancer pa- 
tients. The explanation for this difference is not readily apparent in the text but it is 
at least possible that it reflects a difference in completeness of reporting in the initial 
survey. Relatives of cancer probands may have more readily recalled other relatives 
with cancer because of the recent experience in their own family, with the result that 
more of them were reported in the first round of interviews than in the case of the 
control group. 


Population and Family Planning in India. C. B. Mamoria. Allahabad, India: Kitab Mahal, 
1959. Pp. 167. $1.00. 


Harold F. Goldsmith, Michigan State University


THIS volume is another addition to the growing body of literature demonstrating
the need for an effective population policy in India. While not explicitly stated, 
the book appears to have been written for government officials, administrators and 
medical personnel of birth control clinics, and India’s growing middle class. Professional
demographers and others familiar with the literature about the population
problems will find relatively little in the book which has not already been extensively
discussed and documented in the literature.

From the range of material presented, I would judge that the book has two goals.
The first of these is to demonstrate to a lay audience that India has a population prob- 
lem that can only be solved by the general reduction of family size through family 
planning. This task is successfully carried out. The basic argument presented in the 
book may be summarized as follows: (1) “In relation to the existing stage of her 
industrial and agricultural resources India is definitely over populated.” (2) Any 
increase in population will decrease the standard of living of the Indian people which 
is already dangerously low. (3) As in other countries that are industrializing and 
for much the same reasons, India can expect an increasing rate of population growth 
unless its rate of population growth can be controlled. (4) The proper way to control 
India’s population is to limit the size of the family since long term industrialization 
is not a solution to the immediate problem of over population, and since the destruc- 
tion of surplus population is contrary to the moral principle that “human life... 
is an end in itself.” 

The second purpose of the book is to provide persons responsible for the dissemina- 
tion of birth control information an up-to-date summary of the following areas: 
(1) the pressure of India’s population upon resources, (2) the necessity of having a
population policy, (3) the methods of population control, (4) the difficulties of carry- 
ing out a democratic population policy in India, (5) the current population policies 
of the Indian government and (6) the technical details of various birth control 
methods. 

Because of the monumental size of this second task, many important areas have 
not been discussed. For example, with respect to methods of population control, not
only is there no discussion of the population policies of any of the communist countries,
there is no discussion of the population policies being considered and used by
countries which like India are in the early stages of their demographic transition. 
Since the population policies of countries in western Europe which wish to increase 
their rates of population growth are discussed, it would seem reasonable to expect a 
discussion of the policies of countries like China, Japan, and Puerto Rico where the 
pressures of an increasing population are creating or can be expected to create social 
and economic problems similar to those of India. However, in spite of the fact that 
important topics are not discussed, given the audience of the book, an adequate job 
is done of carrying out the second task. 

The book as it now stands will be a useful one for the intended audience. How- 
ever, the book would have been more useful and better organized if the author had 
explicitly stated the problems he wished to discuss and the audiences he wished to 
reach. Then one could have more easily determined what range of topics is relevant
to his discussion. Further, the book could also be improved if the problem dealt with 
in the book had been reduced to a more manageable size. 


Population Problem in India, A Census Study. P. K. Wattal. New Delhi: Minerva Book 
Shop, 1958. Pp. viii, 228. Rs. 10/-. 


J. Allan Beegle, Michigan State University


THOUGH the sub-title of this book is not inappropriate, it fails to connote the full
scope of the materials covered or to suggest the author’s great interest in popu- 
lation policy. This book is not a systematic census analysis in the sense of Kingsley 
Davis’ The Population of India and Pakistan. Rather it attempts to summarize or
highlight India’s demographic position by making use of existing literature including
numerous special commission reports. The function of this, however, seems to be
a mere backdrop against which the author may state what should be done. 

Among the things that should be done, the author reasonably urges that a high- 
level governmental commission, modeled after the Royal Commission on Population, 
be established to investigate the population problem in all its aspects. “If we are to 
escape Communism,” says Wattal, “the population problem should be investigated 
... at the earliest possible date” (p. viii). The author insists upon the necessity of 
reducing the birth rate “from 40 to 20 per thousand, within a period of 20 years.” 
(p. 228) While the question of how this might be done is considered, nothing new 
has been contributed. Another suggestion made is that population statistics, now 
the concern of numerous agencies, should be centralized in a single agency both at 
the federal and state levels. 

The inadequate system of birth and death registration and the importance of 
improving it is well emphasized. Registration of births and deaths is “reasonably 
satisfactory” in less than one-third of India’s population. Of special interest to west- 
erners are Wattal’s descriptions of cultural features having an impact upon demo- 
graphic phenomena. For example, the high ratio of males to females, unlike most 
western countries, is explained in part by the fact that “sons are earnestly longed for 
while daughters are not wanted and are even differentially treated except in modern 
educated homes.” (p. 60) Wattal cites the case of a 22-year old woman whose desire 
for a male child landed her in jail. A mendicant told her that she would be so blessed 
if she would set seven thatched dwellings on fire. 

If the reader is able to join the author in feeling the same desperation and urgency 
over the population problem in India, the shortcomings of inadequate annotation, of 
unabashed value judgment, and of hasty generalization will diminish. 


China’s Population, Census and Vital Statistics. S. Chandrasekhar. Hong Kong: Hong
Kong University Press, 1959. Pp. 70, map. $1.20. 


John S. Aird, Bureau of the Census


THIS small book contains three sections devoted to (1) the methods of the 1953
Census of Mainland China, (2) published census data, and (3) figures on vital 
rates and population growth. 

The first section surveys some of the descriptive materials on the census available 
outside China. The second is based on official announcements and a paper given 
abroad by a non-Communist statistician from Peking. The third section presents 
some new official data obtained by the author during his visit to China at the end of
1958. 

In his introduction Chandrasekhar says that though the figures he presents are 
fragmentary and contradictory, he is withholding comments and evaluations for
another book. One may question the advisability of presenting to the general public 
without critical review official population figures which the author himself suspects 
may be unreliable. But the question is irrelevant, for this book abounds in value 
judgments, many of which, unfortunately, betray a lack of demographic sophistica- 
tion. For example, a series of figures for the total year-end population of Mainland 
China shows a sudden unexplained change in the annual increase rate from just over 
two per cent for 1949-1956 to about three and a half per cent for 1957-1958. Chan- 
drasekhar grants that there were discrepancies among his sources for some of these 
figures, but in such cases, he says, he chose “the more ‘reasonable’ figure.” (pp. 55-6) 
However, these are not reasonable figures, nor are many of the other official statistics 
for Mainland China, most of which Chandrasekhar seems to accept without reservation.

Another weakness of the book is its failure to be explicit about sources of data. Such 
attributions as “from a communication to the author” (p. 54) and “from various 
Chinese sources” (p. 55) are not adequate identification. If the specific sources were 
confidential, the need for diplomacy is understandable; otherwise, it would have been 
helpful to know what agency released the data and how it had obtained them. 

This book is valuable mainly for its presentation of some figures not previously 
published or made available to other travelers in China. The gradual accumulation 
of such fragments adds to our knowledge of the statistical system of Mainland China 
and contributes something to the exploration of the unknowns in the study of China’s 
population. 


Handbook on Data Processing Methods: Part I, Provisional Edition. Food and Agricul- 
tural Organization of the United Nations. Rome: United Nations, 1959. Pp. vi, 111. $1.00. 
Paper. 


Carl F. Kossack, IBM Research


AS I read this handbook, I had the strong impression that for me to write a review of
it was somewhat of a contradiction since it has been expressly written to assist
less developed countries in meeting the difficulties which many of them experience 
in the processing of statistical data. However, I determined to write a brief review of 
the handbook from the point of view of a person whose experience has been limited 
to developing computing and data processing laboratories in the United States and 
to request the editors to seek a supplemental review from an individual from one of 
these less developed countries. 

From my point of view, the exposition of the several sections is almost painfully
elementary and obvious. There are four chapters in the present handbook:
Chapter 1. Scope and Principal of Data Processing; Chapter 2. Planning, Organizing 
and Administering Data Processing Service; Chapter 3. The Elements of Planning 
and Operating a Punch-Card Installation; Chapter 4. Manual Methods and Tools 
for Data Processing. There is an appendix on Punch-Card Sorting. Most of the material
and situations considered are those arising from the data processing aspects of
a survey such as a national census. 

All too often the discussion introduces the problems that may be encountered with- 
out giving much of a clue as to how to approach a solution. In fact, the references 
given as footnotes to the various sections seem to be one of the more valuable aspects 
of the handbook. Chapter 3 on the planning and operating elements is a somewhat 
detailed outline of the factors that need to be considered in establishing such a data 
processing facility and, as such, should provide a valuable check list for individuals 
involved in such an organizational effort. 

I cannot help but feel that the mode of presentation, which is mainly of a written 
descriptive nature, could stand modification. The use of more diagrams and sketches 
would improve the comprehension while the use of actual examples of installations 
of various types would surely bring the concepts being discussed down to a concrete 
form. 


Digital Computers and Nuclear Reactor Calculations. Ward C. Sangren. New York: John 
Wiley and Sons, Inc., 1960. Pp. xi, 208. $8.50. 


A. T. Bharucha-Reid, University of Oregon


THE complexity of the computations involved in the solution of mathematical problems
associated with nuclear reactors requires the use of high-speed computers.
Hence, any worker in nuclear science or technology should have some knowledge of 
high-speed computers and their use in the study of basic reactor problems. The book 
under review is addressed in the main to nuclear scientists and engineers, and its 
primary objective is to present to that audience an introduction to high-speed 
nuclear reactor calculations. The author has clearly attained his objective; and in 
doing so has presented a concise and very well-written book. Following an intro- 
ductory chapter there are seven chapters and a bibliography. The chapter headings 
are: Digital Computers, Programming, Numerical Analysis, A Code for Fission- 
Product Poisoning, Diffusion and Age-Diffusion Calculations, Transport Equation- 
Monte Carlo, and Additional Reactor Calculations. The bibliography lists items 
separately for each chapter. The introductory chapter and first three chapters can 
be used as a “core” for an introductory course on high-speed computers and calcula- 
tions. The remaining four chapters deal with actual reactor calculations; and should 
be read by any applied mathematician or statistician who wants to gain some knowl- 
edge of nuclear reactor calculations, or who contemplates working with nuclear 
reactor research groups. 


Modern Factor Analysis. Harry H. Harman. Chicago: University of Chicago Press, 1960. 
Pp. xvi, 469. $10.00. 


Donald F. Morrison, National Institute of Mental Health


Modern Factor Analysis is intended as a reference treatise on the technique from
its inception by Spearman to a present state of development that is heavily
dependent upon high-speed computers. Its style and organization are didactic, so 
that it might also serve as a text for a comprehensive course in factor analysis. For 
that purpose its detailed computing algorithms and illustrative examples are accompanied
by a large set of exercises and their answers. A bibliography of some four
hundred titles is included. 

The body of the text is divided into four parts. Part I, Foundations of Factor 
Analysis, consists of six chapters that introduce the concepts and problems of factor 
analysis. Chapter 1 offers a brief history of the method and a résumé of certain of its 
applications. The second chapter presents the linear mathematical model for a variable
in terms of common and unique factors. Chapters 3 and 4 contain good expositions
of the matrix algebra and n-dimensional geometry essential to an understanding
of factor analysis. Rank and communality are treated in Chapter 5, while Chapter 6 
is a summary of the classical factor solutions. 

Part II, Direct Solutions, contains five chapters that treat in detail the two-factor, 
bi-factor, principal axes, centroid, and multiple-group solutions introduced in Chap- 
ter 6. In distinction from these factor solutions obtained directly from the matrix of 
correlations, the four chapters constituting Part III are concerned with the relation 
of one factor solution to another in a common factor space. Chapter 12 is essentially 
concerned with the problem of orthogonal rotation of a solution to “simple structure,” 
while Chapter 13 treats oblique rotations to sets of correlated common factors. 
Analytical methods that permit orthogonal rotations to be performed objectively on 
a computer are discussed in Chapter 14, while similar techniques for rotations to 
oblique simple structure follow in Chapter 15. 

Part IV, Special Topics, contains a chapter on the evaluation of an individual’s 
factor scores from factor loadings, and a chapter on the maximum likelihood approach 
to factor analysis. 

Four of these chapters should be of particular interest to the statistician confronted 
with a problem calling for factor-analytic treatment. Chapter 9, Principal Factor 
Solution, is an excellent derivation and discussion of Hotelling’s method of principal 
components. This technique was the first to break away from the subjective qualities 
of the centroid solution and to employ a mathematically well-defined notion of a fac- 
tor. Hotelling’s iterative scheme for extracting latent roots and vectors is illustrated 
for two moderately large matrices, and a flow chart is presented for programming the 
principal component solution for a general high-speed computer. 

The accounts of the analytical methods of rotation contained in Chapters 14 and 
15 are especially useful. Examples of rotations satisfying the varimax and quartimax 
criteria are compared with those obtained by repeated graphical rotation. Similar 
criteria (the oblimax, oblimin, and quartimin procedures of Carroll, Kaiser, and 
others) for oblique rotation are also developed and illustrated. 

The final chapter, Statistical Tests of Hypotheses in Factor Analysis, is a valuable, 
though brief, summary of Lawley’s maximum likelihood solution for factors. How- 
ever, the virtue of the well-defined likelihood criterion as a means of explicating the 
vague notion of a “factor” appears to have been forgotten in the assertion that its 
resulting significance tests are probably no better than the cruder ones cited in the 
earlier chapters. A computing algorithm allegedly convergent at desk calculator rates 
for small sets of variables is stated and illustrated with one of the correlation matrices 
familiar to the book. The large-sample tests for the adequacy of the number of factors 
are also illustrated. 

Some of the content of Modern Factor Analysis can hardly be called modern. A 
disproportionately large amount of space is devoted to describing the centroid 
method, after the author has stated that it is but a crude approximation to the prin- 
cipal axes solution, and would not be used if a computer and program were available 
for that technique. In the reviewer’s opinion the chapters concerned with Spearman’s 
two-factor solution and Holzinger’s bi-factor solution are excessively detailed, while
no more than passing reference is made to Tryon’s cluster analysis. Spearman’s solu- 
tion has little more than historical interest, and the bi-factor solution requires a 
grouping of the variables into coherent subsets, a condition also shared by the mul- 
tiple-group approach. An analytical criterion is proposed for forming groups in 
matrices with no clear substantive partitioning, although no mention is made of the 
effect of sampling variation in the correlations composing the index. The often in- 
tuitively appealing Radex and Circumplex concepts of Guttman are not discussed in 
the book. 

Certain aspects and usages of the book are a trifle worrisome to the reviewer. The 
notion of a “factor” is employed rather loosely, and in the introductory chapter even 
extends to the canonical correlation situation, although this important variant is not 
treated in the text. The meaning of the linear expression for a factor introduced in 
Section 2.3 might be clearer if “mathematical model” were read for “theoretical 
form.” The identification of the factor model with linear regression in Section 2.6 
is a triviality, although the final sentence seems to contradict an assumption that the 
unique factor is a predictor rather than the error term of the regression equation. No 
initial mention of sampling from a larger universe and the use of 


== (i -2) 


lead one to assume that a given set of observations on the assembled variables consti- 
tutes the entire population. The distinction between sample and population is not 
acknowledged until the final chapter on the maximum likelihood solution.


Modern Elementary Statistics (Second Edition). John E. Freund. Englewood Cliffs, New 
Jersey: Prentice-Hall, Inc., 1960. Pp. x, 413. $7.00. 


Ray Hyman, General Electric Company 


ALTHOUGH this text has been “rewritten almost completely”, readers of the first
edition of 1952 will recognize this second edition as basically the same book. 
Because the revisions are tactical adjustments to bring the book into line with cur- 
rent trends, those readers who welcomed the first edition will find the same features 
to admire in the current version. By the same token, those who had serious reserva- 
tions about the first edition’s approach to teaching statistics will find that the altera- 
tions are insufficient to reverse their opinion. 

The revisions, which are almost all in the direction of improving an already com- 
petent and sound treatise, consist of updated symbols, rearrangements of material, 
and greater emphasis upon tests of hypotheses. In the new edition the sample vari- 
ance is defined with n—1 in its denominator, two nonparametric tests—the sign test 
and the Mann-Whitney U-test—have been added, and short, useful discussions of 
how to calculate with rounded numbers and how to use square root tables have been 
appended. Although six new pages are devoted to an interesting discussion of decision- 
theory, to keep it from being a digression in an otherwise well-organized text, the 
instructor will have to make a special effort, by way of lectures and supplementary 
reading, to integrate this new material with the rest of the text. 

To make room for the additional material, Freund has dropped two chapters, 
“The Nature of Scientific Predictions” and “Statistics and Science,” from the first 
edition. The deletion of these chapters is unfortunate because both were attempts to 
bridge the gap between the student’s everyday experience and the concepts of statistical
reasoning. The short chapter on scientific predictions provided a good, heuristic
introduction to the problems of evaluating the goodness of prediction devices. The 
omission of the chapter on statistics and science places upon the instructor the sole
responsibility for integrating the statistical concepts with the activities of social and 
natural scientists. 

Freund’s objective is to provide a text which emphasizes the broad concepts and 
principles of statistical reasoning without getting bogged down in detail. The text is 
intended for a one-semester course for beginning undergraduate students in the social 
and natural sciences. In this course no mathematics beyond the most rudimentary 
algebra is assumed. The book, in which the sequence and content of the topics follow
the orthodox pattern, is divided into three major parts: Part I covers descriptive sta- 
tistics; Part II covers probability, estimation and tests of hypotheses; and Part III 
covers regression, correlation and time series. 

Within the context of his objective and the restrictions imposed by the limitations 
of his assumed readership, Freund has achieved a rigor and clarity of statement far 
superior to most non-mathematical introductions to statistics. In keeping with his 
intention to emphasize broad ideas, he has avoided padding and unnecessary frills. 
The students will most likely read all the material in the book, with the possible ex- 
ception of the chapters on index numbers and time series, during a one-semester 
course. Because of the heterogeneous interests of the students to be found in such an 
elementary course, Freund has conscientiously varied the content in the examples 
and exercises. 

These same features that result from an attempt to meet the peculiar demands of a 
heterogeneous, non-mathematical readership provide the basis for what some instruc- 
tors might see as deficiencies in the text. The effort to achieve breadth and generality 
imparts a feeling of artificiality to the examples and exercises. It is left to the instructor 
to provide flesh and guts to the statistical concepts and exercises, to show how statis- 
tics forms an integral part of the everyday work of the investigator in the various 
subject-matter fields. Despite the admirable rigor and precision of Freund’s writing, 
the instructor will have to display considerable ingenuity in providing the non- 
mathematical student with heuristic experiences which are sufficient to communicate 
a meaningful referent for such concepts as mathematical expectation, limit of a rela- 
tive frequency, confidence interval, operating characteristic, and sampling distribu- 
tion. Some instructors may feel that Freund’s failure to include computational 
checks, his omission of corrections for continuity in some of his statistical tests, and 
his relegating of some statistical tests to the exercises represents too great a sacrifice 
of detail for the sake of the more general principles. 

Because some of the treatment is practically self-contained—notably the material 
on descriptive statistics and some aspects of correlation—the instructor can safely 
leave much of the drearier, but necessary, coverage of the introductory course to the 
text and the exercises. This excellent feature of the text frees the instructor to concen- 
trate his lectures upon the more challenging aspects of introductory statistics. Al- 
though Freund deals competently with statistical inference, the instructor will almost 
certainly find that he will need to devote the bulk of the class time to amplifying and 
elucidating such issues as the reason for squared deviations, the use of n—1 in the 
denominator of the variance, two-tailed versus one-tailed tests, operating character- 
istics, the necessity for seemingly elaborate precautions and rituals in randomization, 
and the reasons why it is necessary (as Freund points out in his introduction but fails 
to drive home in the remainder of the text) to make sure that the data are appropri- 
ately collected and that the investigation has been properly designed before a statis- 
tical analysis is warranted. 

Considering his audience and his objectives, Freund has succeeded very well in 
producing an updated revision of a deservedly popular text. Whatever drawbacks or 
faults the book may possess appear to be a direct consequence of the restrictions
under which it was written. With the current emphasis upon upgrading the teaching 
of mathematics in secondary schools, let us hope that the necessity of having to produce a
non-mathematical text in statistics will quickly disappear. Meanwhile, where the 
need for such a text continues, Freund’s contribution is unquestionably one of the best 
entries in a highly populated field. 


Symbols, Definitions and Tables for Industrial Statistics and Quality Control. Industrial 
Statistics Committee, Eastman Kodak Company. Rochester, New York: Rochester Institute 
of Technology, 1960. Pp. vi, 202. No price listed. 


George J. Resnikoff, Illinois Institute of Technology


THIS manual was first prepared in 1956 for the internal use of the Eastman Kodak
Company, which felt a need for a standard set of statistical symbols and definitions
to be used on a company-wide basis. It was revised and extended in 1958. Its
subsequent publication by the Rochester Institute of Technology is a recognition of 
its possible usefulness to other practitioners in applied statistics. 

The manual contains definitions of symbols, a fairly complete set of statistical 
tables, short tables of random numbers, and of squares and square roots, and a con- 
siderable amount of other related material. 

Although all of these may be found elsewhere, they are presented here in a single, 
neat, and convenient package. The manual is bound by a plastic ring binder with 
clear plastic covers, and should prove even more durable than the conventional type 
of book-binding. 

Potential users of much of the material contained in this book will be familiar with 
symbols and definitions, however lacking in standardization. For those users who are 
not so familiar, the book will not serve as a text-book in applied statistics. Its most 
important use stems from its original purpose: to serve as a medium of standardiza- 
tion in order to facilitate communication about the results of the application of 
statistical techniques within a large organization. 


Europe’s Coal and Steel Community: An Experiment in Economic Union. Louis Lister. 
New York: Twentieth Century Fund, 1960. Pp. 495. $8.00. 


H. Lubell, The RAND Corporation


AS a description of the problems and the workings of the coal and steel industries
of Western Europe, Lister’s book on the European Coal and Steel Community
(ECSC) is encyclopedic in scope and masterfully done. The book deals with a num- 
ber of aspects of the background and functioning of the ECSC, and handles all of 
them with equal facility, sureness of touch, and thoroughness. The first part discusses 
what might be called the physical structure of the coal and steel industries: produc- 
tion and trade patterns, production costs, and the pattern of investments. The middle 
part deals with industrial organization: the patterns of market control which have 
succeeded, with varying degrees of supervision by the High Authority of the ECSC, 
to the operations of the systems of international cartels that managed the steel in- 
dustries, and to a large extent the coal industries, of Western Europe in the interwar 
period. The last part deals with several fields lying outside the direct scope of the 
ECSC: international trade policy of the member governments, transport rate prob- 
lems, and social policies; and includes a remarkably good chapter on Western Eu- 
rope’s energy market. 

One rather striking fact that appears from Lister’s description of the ECSC’s opera- 
tions is its supra-national character. In a variety of situations, decisions of the High 
Authority concerning individual producers have the force of law (with teeth in the
form of the power to withhold financial support, and the power to impose fines) and 
are put into effect by the law-enforcement agencies of the member states. 

Lister’s discussion of the organization of the coal and steel industries, particularly 
the steel industry, is in the TNEC tradition, dealing in considerable detail with the 
open and (where possible) hidden institutional arrangements controlling production 
and marketing. It is evident from the discussion that the anti-trust provisions written,
largely under American pressure, into the treaty setting up the ECSC have had
little impact on the cartel arrangements regulating the European steel market. The 
pattern of steel production and marketing that has emerged was shaped primarily by 
the fact that there was a sellers’ market for steel in the reconstruction years after 
World War II. One effect of the post-war boom on cartel pricing behavior was to 
reverse the pre-war practice of maintaining domestic prices at a higher level than 
export prices. With domestic prices held down by government price controls, the 
steel producers tried to recoup their finances by taking advantage of the insatiable 
demand for steel in third markets. One of the ECSC’s functions at the time was, 
indeed, to restrain the increases in export prices for steel. More recently, in 1954 and 
again in 1958, recessions in economic activity have led to a return to the pre-war pric- 
ing practice. The ECSC’s main impact during this return to normality has probably 
been in providing an expansionist atmosphere for all the national steel industries in 
the Community to overcome the normal conservatism of the steel producers. 

One of the connecting threads of the book is a question, posed repeatedly: how does 
this affect the economic integration of the countries of the ECSC? In most cases the 
answer seems to be: not very much. There are nevertheless some indications that a 
European economy is being created. For example, during the minor recession of 1958, 
inter-country trade in steel among the members of the ECSC continued to increase 
despite a drop in exports to third countries, indicating that the trade patterns estab- 
lished within the Community during the post-war sellers’ market can at least be main- 
tained during a downturn of the business cycle (pp. 242-6). 

The book is not intended as a work of theory, and relies heavily on Scitovsky and a 
collection of papers edited by Chamberlin for a number of theoretical points regard- 
ing European integration. Lister’s outstanding contributions are his treatment of the 
institutional and statistical material, and the fact that he has plumbed all the pub- 
lished sources available to an insider intimately connected with the field and has pre- 
sented the results in a remarkably clear and intelligible fashion. The large masses of 
data are well handled, well presented, and clearly and thoroughly documented. This 
will be a basic statistical source and reference work for further research on Europe’s 
coal and steel industries for some time to come, and one that is a pleasure to use. Its 
usefulness would, nevertheless, have been increased if the references cited in foot- 
notes throughout the book had been repeated in a single bibliographical list and if a 
more detailed table of contents setting out section headings as well as chapter head- 
ings had been printed. 


Government Policies and the Cost of Building. United Nations Economic Commission for 
Europe. Geneva: United Nations, 1960. Pp. vii, 165. $1.75. Paper. 


Robinson Newcomb, Washington, D. C. 


This comprehensive report prepared by the ECE which deals almost entirely with 
Europe has many of the advantages and also many of the disadvantages of U. N. 
studies. On the positive side, it contains reports from 23 countries. These include 
reports from 6 countries behind the iron curtain, including one from Poland, plus 
one from Yugoslavia. And it includes reports from 16 other countries outside the iron 
curtain. Many of these reports contain data which it would be difficult for anyone not 
acting for the U. N. to assemble. It is a real advantage to a student of the subject to 
be able to use reports from so many countries assembled in one book. 

The disadvantages are the almost inevitable ones associated with this sort of report- 
ing. Official reports may leave impressions that are not always warranted by the facts. 
For instance, the anonymous U. S. report briefly describes an engineering system 
which it is claimed “saves 10% to 25% on masonry labor costs.” Such systems have 
been known for over a half century at least. Taylor described one early in his career. 
Any European reader at least might get the impression that such scientific methods 
were generally used by the masonry trades in the U. S. But such systems are not used 
to any extent here. Other countries, on both sides of the iron curtain, also have put 
their best foot forward from time to time in this book. So only one well versed in what 
is going on can be sure of what to believe, or how important what he reads may be. 

Because the book is written by officials of many countries and edited for officials 
of many nations, it must be written so as not to offend. That means it must be read 
between the lines. For instance, it is only mentioned that “the cost of a house in 
Western Europe averages about four times the annual earnings of a male industrial 
worker”, and that this makes “it difficult to keep rents within 5% of the capital 
cost ... which this rent/income ratio presupposes.” (Part I, p. 9) And the report 
just mentions that one effect of governmental intervention on housing prices has been 
“to support an effective demand for the services of the house building industry at 
existing levels of efficiency and cost,” and adds “the question arises whether the type 
of intervention practiced by governments has not tended to diminish the incentive 
towards technical improvement.” It cost even more years of work for a European 
worker to buy his smaller, less equipped house than it cost the U. S. worker. But the 
implication is not made very clear-cut that aid to housing, while supporting housing 
volume, may have increased housing prices, and so reduced effective demand. 
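
The arithmetic behind the quoted figures can be made explicit. As a rough sketch, assuming that the "annual earnings" in the quotation stand for the income entering the rent/income ratio (an assumption made here, not something the report is quoted as saying), a rent held to 5% of capital cost on a house costing four years' earnings works out as follows:

```latex
\frac{\text{rent}}{\text{income}}
  = \frac{\text{rent}}{\text{cost}} \times \frac{\text{cost}}{\text{income}}
  \approx 0.05 \times 4 = 0.20
```

On these figures the 5% ceiling corresponds to a rent of about one-fifth of earnings; a higher ratio of house cost to earnings raises the implied rent share in proportion.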

The report must be read carefully, not only because the reporters from some coun- 
tries have selected their data, and because some significant implications are buried 
in the text or tables, but also because of some of the premises, implicit and explicit. 
For instance, the report is based on the assumption among others that “demand is 
comparatively inelastic and there is less prospect than for many other products of a 
significant increase in the amounts purchased in response to a decrease in prices.” 
This may be partially true in Europe but it is not true in general in the U. S. 

But any student who wants to read carefully will find this a valuable compendium. 
The material is well organized. It starts off with a five-page summary of trends of 
costs and prices, moves to a 17-page analysis of factors affecting the technological 
development of the building industry, then devotes 25 pages to a description of tech- 
nological developments in different countries, 5 pages to facts about research in home 
building, and 3 pages to suggestions for governmental and international action. Fol- 
lowing these 55 pages of general facts and interpretations are some 78 pages of ap- 
pendices of a statistical and technical nature. 

The report emphasizes one factor which tends to be ignored in the U. S., “the in- 
stability of housing demand has had a marked effect on the structure of the house 
building industry.” 

The report finds the main characteristic of the house building industry to be its 
heterogeneity. Several suggestions are offered for improving the efficiency of the home 
building industry in Europe. All assume a continuation of present patterns of con- 
struction, and of government control. This assumption is a reasonable one. But little 
can be said by U. N. officials to encourage experimentation with U. S. methods. 
U. S. methods, while far behind U. S. potentials, can still provide better housing at 
less cost in terms of man hours, than do European methods and controls. 

In the brief discussion of the impact of technical improvements on home building 
costs it is stated that in some projects labor costs were found to be two to three times 
that in other projects. In general the Communist countries have lower on-site labor 
costs in their multiple housing projects than Western Europe, though this saving is 
offset to a considerable degree by larger off-site and transportation costs. In general it 
was found that an increase by 1% in the amount of equipment on the site could cut 
the hours of labor required at the site by about 10%. So further mechanization of 
erection operations is found desirable. 

The amount of private home building going on behind the iron curtain may sur- 
prise some. It is reported as ranging from 24% in East Germany, through 57% in 
Russia, to 90% in Romania. (Part 1, page 33) This type of building is largely known 
in the U. S. as sweat equity, or self-help housing. The Communist countries have 
found home owners are likely to work harder for themselves than for the government 
so that there can be a greater economy of labor and materials in allowing people to 
build their own homes than in building for them. However, multi-family structures 
are built by the government, in Communist countries. 

The document concludes that mass production of components will grow in the 
industrialized countries, and in other countries better utilization of existing methods 
and techniques may be the most hopeful approach to lower costs. The evidence sug- 
gests productivity is improving and will continue to improve, but “the rate of change 
will accelerate only when the industry is invigorated by an influx of younger men 
trained in modern industrial methods and outlook.” (Part 1, page 49) 

The recommendations for international action or further inquiry will frighten few. 
They include improvement of data on productivity, better coordination of the par- 
ticipants—government, planners, architects, bankers and builders—more stand- 
ardization and modular coordination, better training, better exchange information 
between countries, and an examination of building codes. 

The Russian report will interest many. The numbers (2.7 million units in 1958, for 
instance) are impressive. And so is the use of precast concrete, and the technological 
improvements mentioned. The size of the units, and the average quality, however, 
still leave much to be desired. 


Postwar Market for State and Local Government Securities. Roland I. Robinson. Prince- 
ton: Princeton University Press for the National Bureau of Economic Research, 1960. 
Pp. xxiv, 227. $5.00. 


David A. Baerncopf, University of Oregon 


Municipal bond sales have received little attention from students of finance in 
spite of their impact on markets for funds. These sales have accounted for more 
than one-fifth of the gross volume of new securities publicly offered in the 1946-56 
period. Thus, Robinson makes an important contribution in presenting a well-organ- 
ized description of recent activities in this segment of the money market. 

The author has concentrated on tracing the implications of rapidly rising interest 
rates of these tax-exempt securities. He finds that state and local governments have 
had to “bargain away” a great part of the advantages that they had formerly held 
because of the fact that their interest payments are exempt from federal taxation. 
With the burgeoning needs for schools, roads, sewers, and other municipal construc- 
tion projects, there has been a great increase in the demands of states, cities, and a 
host of new districts for long-term funds. Facing this, there has been a relatively 
shrinking supply of funds from investors who are primarily interested in the privilege 
of tax exemption. 

The first chapter includes a broad discussion of the problems of the market with 
brief comments on the shaky foundations of the federal tax-exemption privilege, the 
effects of this feature on selling other securities, problems in collecting data, and an 
outline of the whole investigation. Statisticians will be interested in this account of 
collecting data, for it introduces the high obstacles of finding relevant materials in 
this field. 

In the next two chapters, Robinson discusses the demand and supply sides of the 
market. He emphasizes the particular difficulties that state and local governments 
face in picking an optimal timing of their demands for funds. The author suggests 
that the volatile swings in municipal bond rates may be attributed partly to the 
changing demands for bonds among the different suppliers of funds to these markets. 
This is noted most strongly in the rapid changes that commercial banks have made 
in their bond holdings. 

Detailed descriptions of the marketing process in the new issues and secondary 
bond markets are contained in Chapters 4 and 5. In bringing up-to-date some of the 
familiar materials in this area, the author outlines carefully the dual role that com- 
mercial banks play in the market, as final holders of bonds as well as important mem- 
bers of underwriting syndicates. While emphasizing some of the buying strategies 
employed by the expert underwriters, Robinson has included some highly interesting 
examples of the coupon structures of serial bonds that have been devised. His findings 
indicated that “moderate sized local governmental units seem to fare quite well in 
the new issues market, often better than the big cities.” In other words, the market for 
new issues has been so highly organized that even small issues of a given credit qual- 
ity can be handled at a reasonable cost. What factors determine “credit quality” is a 
topic that Robinson did not probe, even though he compared bond rates grouped 
according to the familiar Moody ratings. 

Probably the weakest section in the book is that devoted to the secondary market, 
which comprises the trading of bonds issued at some earlier date. Data for this 
analysis are particularly elusive. Within the limitations imposed on him, however, 
the author has succeeded in outlining the major factors that must be considered in 
discussing this sector of the bond market. The final chapter, that takes up the topic 
of revenue bonds, also seems to fall somewhat below the level of the remainder of the 
book. It appears that Robinson has had to compress too much material into too brief 
a space. 

The major analyses of the book appear in Chapter 6 where Robinson has devel- 
oped most carefully his major conclusion that “the privilege of the tax exemption had 
to be bargained away for less and less to investors for whom tax exemption had a 
relatively low marginal value.” An additional finding is that the interest rate differ- 
ential between lower credit quality and higher credit quality persisted more than the 
comparable differential in the corporate field. As a corollary to his primary thesis, 
Robinson has performed an impressive analysis showing that potential revenues lost 
to the federal government have been steadily increasing relative to the saving of bor- 
rowing costs by state and local governments. Furthermore, he found that this 
discrepancy was the greatest for lower grade securities. 

In many passages, this book contains stimulating suggestions of paths for further 
research. In other spots, there are implicit indications of topics that should be investi- 
gated. In pointing out a few of these, we may start with the concept of “quality” in 
these bonds. There appears to be a definite need for a closer analysis of factors deter- 
mining agency ratings and the interest rate experience of “unrated” bonds. This 
work might parallel Hickman’s analysis of the corporate bond market or proceed 
beyond. How have municipal bond issue volumes and rates varied with cyclical move- 
ments? Since Robinson was concerned with a relatively short period, he spent little 
time on cyclical analyses. Finally, there is a question of regional influences in munici- 
pal bond markets. In several places, Robinson pointed to the “parochial” influence on 
banks and individuals, the tendency for bonds in the secondary market to drift 
back to neighborhoods of their origin, and various differences between local and na- 
tional underwriting syndicates. All of these hint that the municipal bond market con- 
tains a multitude of potential projects for economic and statistical research. 

Some may be disappointed because of the lack of a more sophisticated statistical 
analysis of the data that were available. On the other hand, it is clear that such an 
approach was contrary to the aims of the writer. Robinson is primarily interested in 
presenting a broad survey of the postwar municipal bond markets. 

Since this aim was achieved remarkably well, this book is recommended as required 
reading for all who need a knowledge of contemporary money markets. 


Price Determination in Oligopolistic and Monopolistic Situations. Wilford J. Eiteman. 
(Report Number 33, Bureau of Business Research.) Ann Arbor: University of Michigan, 
1960. Pp. 45. $2.50. 


Willard Sparks, Michigan State University 


This monograph reports Mr. Eiteman’s findings from interviewing several manu- 
facturing firms and his reconciliation of their answers into a theory of oligopolistic 
and monopolistic behavior as it relates to price determining procedures. The essentials 
of the book are found in three chapters. The first of these is devoted to explaining the 
equal profit curve of a firm (i.e., given the profit in the last time period and the aver- 
age variable cost, the equal profit curve determines what change in volume would 
have to accompany a change in price to retain the same profit). Chapters III and 
IV discuss the use of the equal profit curve as a tool for management decisions regard- 
ing the price setting policies of oligopolistic and monopolistic firms. 
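
The equal profit curve described above can be sketched numerically. The following is a minimal illustration, not Mr. Eiteman's own computation: it assumes a constant average variable cost and unchanged fixed costs, so that equal profit means equal total contribution (price minus variable cost, times volume). The function name and the sample figures are hypothetical.

```python
def equal_profit_volume(p0, q0, avg_variable_cost, p_new):
    """Volume that must be sold at p_new to earn the same profit as (p0, q0).

    Assumes constant average variable cost and unchanged fixed costs,
    so profit is preserved when (price - variable cost) * volume is preserved.
    """
    if p_new <= avg_variable_cost:
        raise ValueError("new price must exceed average variable cost")
    contribution = (p0 - avg_variable_cost) * q0
    return contribution / (p_new - avg_variable_cost)


if __name__ == "__main__":
    # Hypothetical initial conditions: price 10, volume 1000, variable cost 6 per unit.
    p0, q0, v = 10.0, 1000.0, 6.0
    for p in (9.0, 10.0, 11.0):
        print(p, round(equal_profit_volume(p0, q0, v, p), 1))
```

Under these assumptions a manager who knows only the current operating point can still ask whether a proposed price change is likely to move volume by more or less than the amount the curve requires.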

The entire presentation is based on a hypothetical illustration of a firm. Given the 
initial conditions, price of the product, quantity sold at that price and variable cost 
per unit, an attempt is made to explain how pricing policy is determined by managers. 

Mr. Eiteman’s findings indicate that managers have very little knowledge about 
the location of the demand curve, except the location of the point at which they are 
now operating. They know only that a price increase (decrease) would decrease 
(increase) the quantity sold, although they do not know the amount of the change 
in quantity. Therefore, they could not use marginal analysis in arriving at a decision. 
The decision rule used by managers was one of maximizing annual profits. This rule 
was based on the use of the equal profit curve and the assumption of constant per 
unit cost within a reasonable range of the initial conditions. 

In pointing out the decision procedures used to set prices, two (of the book’s 45) 
pages were devoted to presenting a conversation between the president and the vice 
president of the hypothetical firm. This conversation points out how marginal analysis 
could be used if the demand curve were known and how the final decision is based on 
the equal profit curve computed from the balance sheets of the firm. Of the 43 remain- 
ing pages, five are used to present all the possible demand curves that could exist if 
only one point on the demand curve is known. 

Mr. Eiteman concluded that marginal analysis and the use of the equal profit curve 
give the same results with regard to setting a price. However, he fails to bring out 
that this only holds under the assumption that the firm is maximizing profits at its 
present price-quantity relationship. If this is not the case, it would not be possible to 
find an equilibrium with the analysis Mr. Eiteman suggests. 

Although this book may fall short of the title, only through research such as this 
are the theories and procedures that are used by management to set prices going to 
be qualified. Since there is wide scope for such improvement any attempts may be 
worthwhile. 


Local Impact of Foreign Trade, a Study of Methods of Local Economic Accounting, a Staff 
Report. W. Hochwald, H. E. Striner, and S. Sonenblum. Washington: National Planning 
Association, 1960. Pp. xvii, 213. $7.00. 


H. O. Carter, University of California, Davis 


The objective of this report is to establish an empirical basis for measuring the 
impact of foreign trade on local communities. The familiar input-output account- 
ing scheme is used to quantify interrelationships among local industries and other 
sectors of the National economy in three pilot communities: Fulton County (Glovers- 
ville), N. Y.; Kalamazoo County, Mich.; and Mobile County, Ala. These areas were 
selected as representative of different foreign trade interests. 

The importance of considering indirect as well as direct effects of foreign trade is 
emphasized in Chapter I and substantiated by the results reported in Chapter II. 
The third and final chapter stresses applications and gives a brief evaluation of the 
reliability of the findings. Also, the authors have made a detailed summary and ap- 
praisal of the procedure, which should prove extremely useful for persons undertaking 
similar studies in other areas. 

The appendix, accounting for more than one-third of the report, provides excellent 
supporting material for the body of the text; of particular interest is the discussion 
concerning the simultaneous use of published and survey data for constructing local 
accounts. Most regional input-output studies have relied almost exclusively on pub- 
lished data sources. Five technical supplements, available upon request, give addi- 
tional background material on the study areas and local accounting methodology. 

The limitations of the input-output technique are well known to most readers and 
also recognized by the authors. The method allows only “short-run” estimates of the 
direct and indirect stakes of a community in foreign trade. The community’s ultimate 
“injury” or “advantage” associated with particular trade policies is not measurable 
by any single technique—nor is such a claim made for input-output accounts in the 
report. 

In summary, this report is a noteworthy addition to the other fine studies supported 
by the National Planning Association. 


The Money Supply, Money Flows and Domestic Product in Finland 1910-1956. Kaarlo 
Larna. Economic Studies XXIII. Helsinki, Finland: Finnish Economic Association, 
1959. Pp. 227. No price listed. 


Karl Brunner, University of California, Los Angeles 


This book may be characterized as a highly competent application of econometric 
methods to a description of observable monetary patterns on the basis of a mini- 
mum of theoretical content. After a short introduction in which the author discusses 
the general situation of monetary theory, he turns in Chapter II to the concepts of 
money supply, money flows and marketed domestic product. Suitable formulations 
with appropriate semantic rules are considered, and the final choice of definitions, 
particularly with respect to the money supply, reflects the institutional frame of the 
Finnish monetary system. Chapter III deals with the decomposition of some of the 
time series gathered into three components: seasonal, random and a combined trend 
and cycle element. Tornquist’s method of computing seasonal patterns and appraising 
their changes over time is explained in detail. The results indicate that both amplitude 
and shape of the seasonal movements of Finnish payment procedures and money 
market data experienced radical shifts over the decades. Some suggestions are offered 
concerning possible explanations of the seasonal variations in currency and checking 
accounts in terms of receipt and expenditure patterns of major industries. The random 
variations are determined by a smoothing process (Spencer’s 15-term third-degree 
parabolic graduation formula) applied to each time series. The major result obtained 
by this procedure is that the random variations in time series on clearings seem to be 
twice as large as the corresponding variations in series on checking accounts. 
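
As a minimal sketch of the kind of graduation described here, the following Python applies the standard published weights of Spencer's 15-term moving average and takes the residual as the random component. The function names and the treatment of the first and last seven observations (left undefined) are assumptions for illustration; the review does not say how the author handled the ends of the series.

```python
# Standard Spencer 15-term weights, as numerators over 320, symmetric about the centre.
SPENCER_15 = [-3, -6, -5, 3, 21, 46, 67, 74, 67, 46, 21, 3, -5, -6, -3]
SPENCER_DIVISOR = 320


def spencer_smooth(series):
    """Smoothed value at each interior point; None where the 15-term window does not fit."""
    half = len(SPENCER_15) // 2
    smoothed = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        smoothed[t] = sum(w * x for w, x in zip(SPENCER_15, window)) / SPENCER_DIVISOR
    return smoothed


def random_component(series):
    """Residual of the series about its Spencer graduation (None at the ends)."""
    smoothed = spencer_smooth(series)
    return [None if s is None else x - s for x, s in zip(series, smoothed)]


if __name__ == "__main__":
    # A pure cubic is reproduced by the formula, so its residuals are zero up to rounding.
    cubic = [0.5 * t**3 - 2 * t**2 + t + 4 for t in range(30)]
    interior = [r for r in random_component(cubic) if r is not None]
    print(max(abs(r) for r in interior))
```

Because the weights reproduce any third-degree polynomial, whatever remains after smoothing is attributed to the random element; comparing the spread of these residuals across series is the sort of comparison behind the clearings versus checking accounts result.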

Chapter IV considers some relations between money supply, money flows, mainly 
represented by clearing data and debits to deposit accounts, and the marketed domes- 
tic product. A number of sections deal with the usual velocity concepts and their 
measurement. The behavior of both transaction and income velocity is described, and 
we note particularly that velocity seems positively associated with cyclic movement, 
probably with a lag at the turning points. The central sections of this chapter are 
devoted to an econometric analysis of the relation between marketed domestic prod- 
uct, on the one side, and debits to deposit accounts or clearing volume joined by vari- 
ous money supply components on the other side. The general schema underlying all 
the regressions actually estimated may be presented as follows: 


$$y = a_0 + \sum_{i=1}^{3} a_i x_i + \sum_{j=1}^{5} b_j z_j$$

The dependent variable $y$ is the log of the marketed domestic product and the $x_i$ are 
the logs of the clearings of the Bank of Finland, debits to check accounts at com- 
mercial banks and charges to deposit accounts at commercial banks and other insti- 
tutions. The $z_j$ are the logs of central bank money, account money, cash holdings of 
monetary institutions, the money supply, and deposit accounts. The various models 
differ according to their a priori assignment of zero to selected $a_i$ and $b_j$ coefficients. 
The author also estimated some variations on this theme which involve first and sec- 
ond differences among the explanatory magnitudes. 
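
To illustrate what such a zero restriction amounts to, here is one hypothetical member of this family. The choice of which coefficients are set to zero is made purely for illustration and is not taken from the book. Number the three flow variables and the five money-stock variables in the order listed above, so that the second flow variable is debits to check accounts and the fourth stock variable is the money supply. Setting every coefficient to zero except the constant and the coefficients on those two variables leaves:

```latex
y = a_0 + a_2 x_2 + b_4 z_4
```

Each model of this kind is a log-linear predictor of marketed domestic product from a small subset of the eight monetary series.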

These models do not emerge from the construction of any hypotheses concerning 
the monetary mechanisms of the Finnish economy. We may view them as empirical 
formulae predicting an index of the economy’s general performance in terms of mone- 
tary magnitudes which are rapidly measurable and apparently easily available. 

Concern for computational procedures and the purely descriptive unfortunately 
dominate the book. There is barely any economic analysis guiding the conceptual dis- 
cussions on money supply and velocity or the selection of relations and variables to 
clarify the structural properties of monetary mechanisms. The book also reveals that 
a high competency in the use of econometric methods offers no protection against 
basic methodological naivetes expressed by the discussion on the nature of Keynesian 
theory and the constitution of an adequate hypothesis (in Chapter I), or by the com- 
plete disregard that concept formation is an integral part of the construction of an 
empirical hypothesis and cannot be separated from this process and the associated 
appraisal (Chapter II), or the idea that the regressions estimated contribute to verify 
the Fisherian equation (in Chapter IV). 


The Scientist in American Industry. Simon Marcson. Princeton, New Jersey: Industrial 
Relations Section, Princeton University, 1960. Pp. ix, 158. $3.00. Paper. 


Carl F. Kossack, IBM Research 


This report is based upon a two years’ study of a large industrial research labora- 
tory, where scores of persons at all levels were interviewed. The problem to which 
this study is addressed is the conflict in goals of the traditional business enterprise 
and the individual scientist. 

The report is very well written, blending together interview quotations with the 
discussional material. The extensive use of footnotes for the definition of terms seems 
to be superfluous since I found it difficult to keep up with the thread of the argument 
if I paid attention to the footnotes. It is therefore recommended that at first reading 
one simply skip these. The central theme of the report is the need for entirely new 
principles of management in the industrial research laboratory. Unfortunately, 
though the problems that create this need are well documented, there is no solution 
to the problem given nor even suggestions as to what these new principles might be. 

Since the emphasis deals with the physicist engineer scientist, the statistician will 
find the study of only general interest. In fact, the statistician in industry presents, in 
my opinion, an entirely new set of goal-type “conflicts.” Although the study is re- 
stricted to a single laboratory, the general findings and conclusions seem to be valid 
over a large class of industrial situations as far as scientists are concerned. I recom- 
mend that all new college graduates in science, including statistics, who are con- 
sidering a career in industry, read this report before deciding between an industrial 
or an academic opportunity. The report should also be on the must list of industrial 
and governmental management personnel who are responsible for the administration 
of programs involving scientists. 

Economic Arithmetic. Robin Marris. New York: St. Martin’s Press, 1960. Pp. xvii, 344. 
$4.50. 


John P. Henderson, Michigan State University 


Written for the comprehension level of second and third year Honors students 
in British universities, this text is restricted to a general survey of the methods 
and uses of three tools: national income accounts, time series analysis, and index num- 
bers. Ignoring for the moment that the discussion of income accounts is directed to 
current procedures and practices in the United Kingdom, the volume would be an 
excellent one in this country for a first year graduate course in statistical research 
methods for economics and business students. We have become accustomed, of course, 
to a high level of exposition and good writing style from British economists and on 
this score the volume under review is decidedly first rate. Would that American text- 
books were so well conceived. Marris makes no pretense at anything very new and 
original; “such originality as claimed lies in methods of presentation of arguments 
rather than in arguments themselves.” (p. vii) Professors of economics would, I think, 
be quite happy if all doctoral candidates knew how to use and manipulate these few 
simple tools as well as appreciating when not to employ the more advanced tech- 
niques; all of this they could easily learn from this volume. 

According to Marris, “economic arithmetic” is a broader subject than “economic 
statistics,” in that the former covers the integration of economic theory with statisti- 
cal technique. The major problem of economic arithmetic is to convert data into a 
form whereby it can help in the understanding of economic life. This approach is 
stressed particularly in Part III, and the discussion of index numbers. 

So far as this reviewer is concerned, one of the more important themes of the vol- 
ume is that “by far the most elemental, and most essential, procedure for analysing 
statistical data is one which seems so obvious that it is easy to forget to mention . . . 
the method of simple inspection.” As Marris so rightly says, “over and again we see 
the inexperienced student performing elaborate computations designed to test some 
relatively far-fetched hypothesis when another simple explanation of the movements 
of the data is staring him in the face.” This does not mean, of course, that there is not 
a large role for advanced mathematical and statistical procedures in economics, but 
merely that much of our basic economic data has a questionable origin, and legiti- 
macy can not be breathed into it by using sophisticated procedures and methods. 
What one needs for analysing a large majority of our economic data is not mathe- 
matical technique but good old simple “arithmetic”—ratios, percentages, simple 
averages. 

In Part I, “Anatomy,” Marris surveys the sources and procedures of the National 
Income Blue Book of the United Kingdom. Anyone wishing a brief and succinct review 
of current national income procedures in the U. K. is directed to these four chapters. 
The material is presented in a very clear fashion and against a background of two 
excellent flow charts that are far superior to those found in the majority of economics 
textbooks. In addition, there is excellent commentary on the adequacy, more often 
inadequacy, of official accounts and emphasis on the fact that most data is collected 
for purposes other than for providing information on the economy’s income and outgo. 
In this respect, of course, there are numerous similarities to the U. S. case, and one
finds much the same type of criticisms as those leveled at the Commerce Depart- 
ment’s material. The theme in both instances is that governments do not collect the 
data which economists and statisticians would like to have for constructing better 
national income accounts. The major difference between Marris’ treatment of na- 
tional income accounts and that of most textbooks in this country is that the latter 
gives relatively little attention to the sources and adequacies of the basic data; the 
reviewer finds Marris’ treatment more realistic. 

In Part II there are “some notes on the elementary aides to inspection,” referred to 
earlier, along with two chapters on time series and simple and multiple regression 
analysis. In addition to the usual treatment of the various statistical techniques, 
again well carried out, there is a good deal of stress placed on giving the student some 
guides as to when one should or should not use the several methods. There are some 
“pauses for algebra” but even then the material should not offer much difficulty to 
the student with little mathematical background since there is no derivation of 
formulae. 

Index numbers receive a more complex treatment, but, of course, they are a complex
problem. The discussion of index numbers is centered around the fact that this
particular tool is essentially a means to measure changes in economic welfare—“a 
thing which cannot directly be measured.” There are chapters on both changes in 
real family income (9) and real national income (10). After describing in a straightforward
way the general properties of index numbers, along with an algebraic appendix,
Marris moves in chapter nine to linking Laspeyres’ and Paasche’s indices to
the theory of value by means of indifference curves. It is in this section that the vol- 
ume has its most originality, as Marris discusses the general theoretical issue of what 
is at stake in the use of index numbers. This raises the question of the degree to 
which Fisher’s index is “ideal” and the relation between economic theory and the 
problem of measurement. Indeed, chapter nine does a good job of showing the prob- 
lems connected with integrating value theory and statistical techniques. Next time 
the reviewer teaches a course in price theory all of the chapters in Part III will be 
used, since the discussion is an excellent review of the relation between economic 
welfare, value theory and index numbers. 
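
For readers who want the three indices named above in concrete form, the following is a minimal sketch and is not taken from Marris's text. The two-good budget is hypothetical, and the function names are chosen here for illustration only. It computes the Laspeyres index (base-period basket), the Paasche index (current-period basket), and Fisher's index as their geometric mean.

```python
from math import sqrt

def laspeyres(p0, p1, q0):
    """Price index that values the base-period basket q0 at both sets of prices."""
    return sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0))

def paasche(p0, p1, q1):
    """Price index that values the current-period basket q1 at both sets of prices."""
    return sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1))

def fisher(p0, p1, q0, q1):
    """Fisher's index: geometric mean of the Laspeyres and Paasche indices."""
    return sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))

# Hypothetical two-good family budget (illustrative numbers only).
p0, p1 = [1.0, 2.0], [2.0, 2.0]    # prices in the base and current periods
q0, q1 = [10.0, 5.0], [5.0, 10.0]  # quantities bought in each period

L = laspeyres(p0, p1, q0)
P = paasche(p0, p1, q1)
F = fisher(p0, p1, q0, q1)
print(L, P, F)
```

In this example the family shifts its purchases away from the good whose price rose, so the Laspeyres figure exceeds the Paasche figure and Fisher's index falls between the two. Whether that middle value deserves to be called "ideal" is the theoretical question the chapter takes up.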

One point on the history of thought. Marris gives credit to Professor Busche- 
guennce, of Moscow University, for linking the Fisher index with indifference curve 
theory, while Erland V. Hofsten (Price Indexes and Quality Changes, Stockholm, 
1952) gives credit to A. A. Konüs, also a Moscow professor. In each instance the
date of publication is given as 1924. The question is, who published first? 


The Economics of Competition in the Transportation Industries. John R. Meyer, Merton J.
Peck, John Stenason, and Charles Zwick. Cambridge, Massachusetts: Harvard University 
Press, 1959. Pp. xvi, 359. $7.50. 


Herbert D. Mohring, Transportation Center, Northwestern University


THE basic premises of Competition in the Transportation Industries are that competition
leads to “at least a fair approximation to an optimum in economic affairs,”
and that “regulation is essentially a substitute for competition in the protection of 
the public interest.” The Messrs. Meyer, et al. have therefore set for themselves the 
task of assessing “the extent of competition in transportation markets and” describ- 
ing “what the industry’s probable structure would be if these competitive forces were 
released from regulatory restraints.” On the basis of this assessment, they have 
sought to formulate the elements of an improved public policy toward transportation. 
In broad outline, the steps between expression of their basic value premises and their 
final policy recommendations entail: a) marshalling a considerable array of data on 
transportation cost characteristics, b) utilizing these data to specify a rational 
(i.e., cost minimizing) allocation of transportation resources, c) assessing the effects
of present day pricing procedures and possible modifications to them on the attain- 
ment of such a rational resource allocation, and d) analyzing the market structure of 
transportation industry to determine the degree to which existing or latent com- 
petitive forces would provide an effective substitute for regulation. 

For many, the most valuable portions of this study will prove to be the sections 
dealing with transportation cost characteristics. Of these, Chapter 2 purports to give 
“some familiarity with the underlying theoretical concepts of production and cost 
theory,” and to provide a critique of alternative cost estimating procedures. Chapters 
3-5, respectively, set forth the operating cost characteristics of railroad, highway, 
and other modes of transportation. Finally, Chapter 7 describes an analysis under- 
taken to determine the average value to shippers of truck transportation associated 
shorter delivery times and lower shipment sizes required to achieve minimum costs. 

Their findings on cost characteristics (largely based on published accounting data) 
defy any attempt at brief summarization. A few words of description do, however, 
seem in order on their railroad cost estimates. In this analysis, the operating expense 
accounts of 25 of the 27 largest railroad systems (New York Central and Pennsyl- 
vania were excluded for a variety of reasons) for three groups of years (1947-50, 
1952-55, and 1954-55) were separated into six broad categories: general, traffic (i.e., 
marketing), station, line haul, yard, and maintenance. Least squares techniques 
were used to relate individual expense categories or subcategories to a linear combi- 
nation of one or more measures of output (e.g., gross ton miles of freight or passenger 
traffic, yard engine hours) and a measure of size of plant (e.g., miles of track). The 
coefficients of the size of plant variables proved insignificant for the general, traffic, 
line haul, and yard expense categories. Thus, it would appear, for example, that the 
additional line haul costs incurred in carrying an additional ton mile of freight on a 
large, lightly utilized system are essentially the same as on a small, intensively utilized 
system. The “size of plant” coefficients were substantial, however, for station expenses 
and most of the maintenance expense sub-categories analyzed. Furthermore, the 
constant terms in most of the regression relationships—interpreted as “threshold 
costs that must be met before the lowest level of marginal costs can be attained”— 
were generally quite large. So, too, were the constant terms in relationships developed 
between various capital investment categories on the one hand and gross ton miles of 
freight and of passenger traffic on the other. 
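
The kind of relationship described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' specification or data: the five "railroads," their figures, and the helper name `ols` are all hypothetical. One expense category is regressed on a measure of output and a measure of size of plant, and the constant term is read as a threshold cost.

```python
def ols(rows, y):
    """Least-squares coefficients [b0, b1, ...] for y = b0 + b1*x1 + ...,
    obtained by solving the normal equations with Gauss-Jordan elimination."""
    X = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        # Partial pivoting: bring the largest remaining entry in column c to the diagonal.
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical systems: (gross ton-miles, miles of track), arbitrary units.
rows = [(10, 100), (20, 300), (30, 200), (40, 500), (50, 400)]
# Hypothetical line haul expense, constructed to depend on output only.
y = [50 + 3 * gtm for gtm, _ in rows]

threshold, marginal, plant = ols(rows, y)
print(threshold, marginal, plant)
```

Because the invented expense figures depend only on output, the fitted size-of-plant coefficient is essentially zero, the slope on gross ton-miles plays the part of marginal cost, and the constant term plays the part of the threshold cost. A large constant of this kind is what points toward scale economies in the discussion that follows.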

Accepting their results at face value would suggest, then, that scale economies of 
one sort or another do unquestionably exist in railroading. This, in turn, implies that 
railroad mergers would reduce the economy’s aggregate railroad transportation bill.
It further suggests that marginal cost prices could be instituted by railroads only if 
they were at the same time subsidized. Unfortunately, the Messrs. Meyer, et al. 
neither provide estimates nor tabulate their data in a fashion enabling independent 
estimation of the magnitudes of either merger-related savings or subsidy require- 
ments. 

In Chapters 6 and 7, the authors utilize their cost estimates to specify a rational 
allocation of transportation resources. Among their conclusions are: 1) that bulk 
commodities should move either by water or pipeline wherever feasible; 2) that, when 
compared to pure truck movements of high value commodities for 200 miles or more, 
the lower line haul costs of rail car load and piggyback operations—particularly the 
latter—offset the higher terminal costs entailed in these operations; and 3) except, 
perhaps, for short haul, high density coach movements, there is little economic justi- 
fication for rail passenger operations. 

Such an allocation of transportation resources could, they contend in their concluding
chapter, be achieved with a considerable reduction in the scope of regulatory activity.
Specifically, their principal policy recommendations run along the following
lines: 1) maintenance of rate regulation only on those forms of transportation where 
elements of monopoly continue to exist (e.g., rail movements of some bulk commodi- 
ties, certain pipeline operations, and rail piggyback services) and complete elimina- 
tion of rate regulation by carrier rate bureaus as well as by public bodies in all forms 
of transportation where entry is easy; 2) termination of unremunerative services; 
3) limited mergers—mainly of the very smallest railroads—but only to the extent 
that increased efficiency would result; and 4) ultimate extension (“wherever possible”) 
of user cost pricing to all areas where transportation facilities are publicly provided. 

The study is not without its shortcomings. The clarity and rigor of their discussions 
of points in elementary economic theory could have been improved considerably. A 
more complete analysis of the errors inherent in the procedures used to estimate rail- 
road costs would have been helpful. I particularly missed a discussion of the relevance 
to their work of comments by Stigler, Friedman, and others concerning the regression 
fallacy and its implications for the use of cross section data in estimating cost rela- 
tionships. 

The book contains several minor factual inaccuracies. None of them—at least
none of those I uncovered—affects the validity of their conclusions. What they have 
to say about airline competition in Chapter 8 is, for example, no less true because 
there were twelve domestic trunk airlines in 1957, not thirteen as stated on page 228. 
Unfortunately, however, the existence of these minor inaccuracies might well serve 
to limit the degree of acceptance of their conclusions by those most in need of en- 
lightenment—their institutionally oriented academic confreres and decision makers 
in various state and federal regulatory agencies. 

The analysis involved in assigning a speed-related premium to truck transporta- 
tion is rather one-sided. For many shippers, speed is more often a curse than a bless- 
ing. This is particularly true for commodities—canned goods and plywood, to name 
but two—for which production and consumption cycles are out of phase. By select- 
ing a routing that maximizes the number of freight yards through which a winter 
shipment of lumber passes on its way east, a west coast producer can—and typically 
does—increase time in transit thereby cutting his warehousing costs considerably. 

Apart from these perhaps trivial objections and a few more of similar character, I 
do have a rather fundamental criticism of the analysis leading to one of the major 
conclusions of their study. The desirability of extracting from users the long run 
marginal costs of publicly provided transportation facilities (e.g., highways, canals, 
airports) is stressed at several points in their discussion. However, they erroneously
identify the capital costs of these facilities with long run marginal costs. For this
reason, most of the discussion of highway costs, for example, revolves around the 
logically impossible task of developing logical standards for allocating the joint costs 
of highway construction among various user groups. 

The true economic costs of using highways are not those entailed in amortizing 
their initial construction costs. They are, rather, use associated maintenance costs 
and the costs users impose on each other—the costs associated with the fact that it 
requires more time and dollars to travel on a road when it is heavily than when it is 
lightly utilized. Efficient utilization of a highway would require taxing each indi- 
vidual user by an amount equal to the congestion costs his trip imposes on other users. 
Such efficiency tolls or taxes would be substantially higher at times of peak loads— 
morning and afternoon rush hours, in particular—than during off peak hours. This 
means, then, that an efficient highway tax system would probably entail substantially 
heavier (not lighter as they suggest on page 268 and elsewhere) taxes on automobile 
commuters than are presently levied. 
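
The arithmetic behind such an efficiency toll can be sketched briefly. Everything in the example is an assumption made for illustration: the travel-time curve, the capacity, the value of time, and the function names do not come from the book under review. If each trip takes time t(q) when q vehicles per hour use the road, an additional trip adds q times dt/dq to the time of all other users, and the toll is that amount valued in dollars.

```python
def travel_time(q, t0=0.5, cap=2000.0, power=4):
    """Hypothetical hours per trip when q vehicles per hour use the road."""
    return t0 * (1.0 + (q / cap) ** power)

def efficiency_toll(q, value_of_time=2.0, t0=0.5, cap=2000.0, power=4):
    """Congestion cost one more trip imposes on the other users:
    value_of_time * q * dt/dq, which for the curve above equals
    value_of_time * t0 * power * (q / cap) ** power."""
    return value_of_time * t0 * power * (q / cap) ** power

peak_toll = efficiency_toll(2000.0)     # rush hour, road at its assumed capacity
off_peak_toll = efficiency_toll(500.0)  # lightly used road
print(peak_toll, off_peak_toll)
```

With these invented figures the peak toll is several hundred times the off peak toll, which is the sense in which an efficient tax system would fall most heavily on rush hour commuters.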

In conclusion, this volume attacks what is both an important and a timely subject. 
As the Messrs. Meyer, et al. note in their preface, transportation—particularly rail- 
road—problems have once more become a major domestic public policy issue. Unless 
something is done soon, it is commonly claimed, transportation service will be greatly 
curtailed, a large number of railroads (particularly in the Northeast) will face bank- 
ruptcy, and commuters will either be stranded or will drown our major metropolitan 
areas under a sea of private passenger vehicles. The remedies commonly espoused 
have a dreary ring: merger of the nation’s railroads into one, two, seven, eleven or 
some other magic number of major systems; government subsidy in the form of tax 
reduction or elimination, low cost loans, or outright payments; and stricter and more 
comprehensive regulation (except, of course, of the carrier group making recommen- 
dations). 

The presentation of a contrary point of view has been long overdue. In arguing for 
seeking a solution to today’s transportation problems through increased reliance on 
competitive mechanisms, the authors of Competition in the Transportation Industries 
have performed a considerable service. 

True, the study does have a number of shortcomings. For at least two reasons, 
however, this is hardly a damning criticism. Except perhaps for the last one (and it, 
after all, applies only to a small part of their analysis), none of the above criticisms 
are in any sense fundamental. Then too, the existence of shortcomings would almost 
certainly characterize any attempt to treat comprehensively any of the major policy 
issues of our day. In brief, a study of this quality would be a welcome addition to 
the literature on any public policy issue. It is particularly welcome when contrasted 
to the institutional descriptions that all too commonly pass for economic analyses of 
transportation problems. 


Work Sampling for Modern Management. Bertrand L. Hansen. Englewood Cliffs, New 
Jersey: Prentice-Hall, Inc., 1960. Pp. xvii, 263. $7.50 Trade, $5.65 Text. 


Frank J. Williams, San Francisco State College


AT a time when certain traditional methods of measuring labor operations are inefficient
and costly, this book considers sampling as an aid in gathering information
about work and delay, and in measuring and setting standards for both productive 
and non-productive work elements. As might be expected in a book which attempts 
to “meet the needs of undergraduates, advanced students, and persons in the field,” 
the presentation is somewhat uneven. 

Written in three parts, the first offers an “unsophisticated, step-by-step procedure 
for making a work sampling study” which seems all right for untrained industrial 
management, staff, and supervisory personnel. But largely because of the heavy-handed
attempts to “convince” the obdurate that work sampling really works, this
part, said to be for undergraduate students also, fails to come up to classroom stand- 
ards. It is doubtful that intelligent undergraduates either in industrial engineering or 
statistics, or their instructors, would be satisfied at being included with the “how to” 
reader who is assured that the sample size given by a nomograph will “usually be 
satisfactory,” and that if still doubtful, “he should get someone with a probability 
bent to set the right ‘theoretical’ sample size for him.” Both the quality of the writing 
and the content of this part seem better suited to the shop and trade school than to 
the college classroom. 

Although the rest of the book reads somewhat like Factory Management and Main- 
tenance, it is better than the first section. Part Two contains seven articles, four of 
which were published earlier in industrial magazines, dealing with applications of 
work sampling in such areas as plant maintenance, materials handling, and clerical 
activities. Part Three is also a collection of (four) papers including J. A. C. Williams’ 
“Work Sampling Techniques in Work Study,” and A. C. Rosander’s good report of 
his work to a Cleveland meeting of the ASQC in 1958, “Random Time Sampling 
Applied to Work Activities for Cost Estimation and Control.” Two articles by the 
author, “A Graphic Method for Finding Required Number of Time-Study Readings,” 
and “Waiting-Line Analysis and Work Sampling,” which follows closely the earlier 
work of D.C. Palm as reported in The Journal of Industrial Engineering, close the book. 


The Michigan Economy: Its Potential and its Problems. William Haber, Eugene C. 
McKean, and Harold C. Taylor. Kalamazoo, Michigan: The W. E. Upjohn Institute for
Employment Research, 1959. Pp. xv, 395. $3.25. 


Herbert E. Striner, The Brookings Institution


THE chief asset of this book is that it brings to the reader a handy single source of
a large amount of data concerning the economy of the State of Michigan. In the
body of the text as well as in the copious appendix, it emphasizes the many facets 
which go to make up the structure of a large economic entity such as Michigan. The 
shortcomings of the book result from a default in methodology, excessive repetition, 
and too little concern with new data programs which the work indicates are badly 
needed. 

The type and amount of data which this book has brought together on the econ- 
omy of Michigan reflect a great deal of painstaking research. Fruitful use of the data 
will depend, however, on how much disaggregation is required. The data included 
in the study should be helpful in the analysis of 2-digit industries. 

Unfortunately the authors have made no use of input-output techniques which 
could have been most helpful in their analysis. For example, when references are 
made to employment which is affected by cuts in the defense and automotive industries
(pp. 90 and 100), only the direct impact is considered; no reference is made to
the obvious value of input-output analysis when both direct and indirect effects of 
changes in final demand on producing sectors are relevant. 

Though the chief culprit of Michigan’s underemployed economy is shown to be the 
gyration of the demand for autos, the authors see additional auto demand and a 
share in defense industries as contributing to a solution. Diversification in some other 
direction, however, might be more appropriate. They need only look at the California 
experience in aircraft production to realize that at least Michigan has been saved 
that headache. 

With respect to the role which taxes, unions and fiscal policy play in enhancing a 
business environment, this book has a number of eminently sensible things to say. 
The tax analysis in the text is supported by an excellent appendix by Harvey E. 
Brazer on “Taxation and Industrial Location in Michigan.” The appendix by the
Fantus Locating Service on transportation is not of as high a quality. 

The authors very wisely recommend a Governor’s Council of Economic Advisers 
and an economic report. Both could go a long way towards inducing more rational 
economic development programs as well as the sort of supporting research program 
in which Michigan’s universities could participate. It’s unfortunate that the recom- 
mendations did not include a “roughed-out” data program which the State could 
consider immediately. The cost would be slight relative to the value of the economic 
analysis it would permit. The “Program of Action and Study” discussed on pp. 53-56 
is a good starting point for a “Temporary Economic Development Commission” 
which could be established quickly. 

Worth special mention is the authors’ suggested retraining program, which would 
move the unemployed out of the automotive industry into other industries. Depressed 
areas and states often talk about this, but never seriously consider implementing the 
suggestion. With increasing technological unemployment as a possibility, manage- 
ment, labor and government had better begin to get together on realistic retraining 
programs or face the prospects of supporting a needlessly large number of unemployed
in the future. 


Blue Collar Man: Patterns of Dual Allegiance in Industry. Theodore V. Purcell. Cam- 
bridge, Massachusetts: Harvard University Press, 1960. Pp. xviii, 300. $6.00. 


Martin Patchen, University of Michigan Survey Research Center


THE dual allegiance of the worker as an employee and as a unionist is examined in
this book. Father Purcell has studied differences in the attitudes and behavior of
packinghouse workers employed by the same company (Swift) but located in three
different cities and belonging to three different unions.

The bulk of the data reported are based on interviews conducted by the author. 
A random sample of hourly paid employees, stratified by sex, race, and service, was 
interviewed. Twenty attitudes concerning subjects such as the wage incentive sys- 
tem, foremen and union leaders were isolated from each interview and graded on a 
nine-point scale of favorability. (The author reports that an unspecified “high degree” 
of reliability among three interview analysts was achieved in making these ratings.) 

Comparisons are made of the attitudes of employees in the three cities—Chicago, 
Kansas City, and East St. Louis. Within each plant, the attitudes of men and 
women, Whites and Negroes, and persons of varying lengths of service are compared. 
The viewpoints of union leaders and of foremen are also reported. Differences in the 
behavior of workers at the three plants are shown by data on turnover, absence and 
participation in union affairs. The author places the data on worker attitudes and 
behavior in the context of interesting accounts concerning the relations of each of 
the three unions with Swift and Co., the distinctive urban settings of the three plants, 
and the post-World War II strikes in the meat packing industry. 

In presenting these materials, Father Purcell has achieved a happy blend of the 
quantitative and the qualitative. Abundant quotations from interviews give depth 
and greater meaning to extensive quantitative material. In fact, many readers will
find the excerpts from actual interviews to be the most rewarding portion of the book. 
The fact that Father Purcell spent ten years studying the packinghouse workers adds 
a valuable time dimension to this study—a feature which is lacking from a number of 
“one-shot” industrial studies. Moreover, some of the findings which the author pre- 
sents are of considerable interest—e.g., that those employees with the most favor- 
able attitudes toward the company were also most strongly behind the union. 

What is most lacking in this study is a satisfactory explanation of the differences 
which the author finds in the attitudes and behavior of employees at the three plants. 
Why are those at Chicago least satisfied and those at East St. Louis most satisfied?
The variables of race, sex, and length of service, which the author systematically 
uses, are by his own judgment, “not very important” in explaining these attitude dif- 
ferences (p. 164). He suggests that the different philosophies of union leaders, the 
different role of the Negro in the different union locals, and the special problems of 
Chicago as a metropolis may be important explanatory factors. But these explana- 
tions remain vague and have little data—either quantitative or qualitative—to sup- 
port them. 

A fundamental shortcoming of the study is the weakness of the key explanatory 
concepts of company allegiance, union allegiance, and dual allegiance. The concept 
of company allegiance appears amorphous and imprecise. It is not intended to mean 
loyalty to the company. It is said to mean, instead “the worker’s degree of approval 
of the company as an institution” but is measured by answers to the question, “Put- 
ting it all together, what would you say about Swift as a place to work?” (p. 59). 
Company allegiance seems to represent no more than a rough average of general 
satisfaction—or, in the author’s terms, “a summation of the workers’ attitudes toward 
job, pay, standards, gang, foreman and so forth” (p. 59). The limited usefulness of 
the concept, as used in this study, is indicated by the fact that company allegiance, 
as measured, bears no consistent relation to worker behavior. It is true, as Father 
Purcell emphasizes, that Chicago workers, who had the least company allegiance, also 
had the most absences. But Chicago workers also had the least turnover. Moreover, 
as the author acknowledges, East St. Louis workers, who had the strongest “company 
allegiance,” went on strike while Kansas City workers, with less company allegiance, 
stayed on the job. 

The definition of union allegiance as “belief in the necessity and existence of a union 
in the plant” (p. 167) is more clear than the concept of company allegiance used. 
Moreover, union allegiance, as measured, shows positive relations to some indicators 
of union participation—like voting in union elections. It may be noted, however, 
that there is a lack of parallel in the meanings of “allegiance” as it is applied to the 
union and as it is applied to the company. 

The subject of dual allegiance in industry, which forms the focus of Father Purcell’s 
book, is an important one. But unfortunately the nature of the “dual allegiance” at 
Swift remains cloudy. Does it represent a real division of loyalty or adherence to differing
values on the part of packinghouse workers? The data presented lead one to
wonder. In the crisis situation of a strike, the main uncertainty for most employees 
seemed to be whether they could hold out economically for a long strike period. If 
dual allegiance does represent a real division of loyalty, what is the nature of this 
division? What specific effects on behavior are exerted by varying amounts of dual 
allegiance? And what determines how dual allegiance will be resolved in a conflict 
situation? 

This study tells us disappointingly little about these matters. It does, however, 
provide us with a well-written and frequently stimulating account of how workers in 
one industry feel about their company and about their union. 


One-Tenth of a Nation: National Forces in the Economic Growth of the New York 
Region. Robert M. Lichtenberg, with Supplements by Edgar M. Hoover and Louise P. Lerdau. 
Cambridge: Harvard University Press, 1960. Pp. xvi, 326. $6.75. 


Donald L. Foley, University of California, Berkeley


THIS book is the seventh in the valuable series of nine volumes analyzing the likely
changes in New York’s economy over the next 25 years. (A tenth book, but not 
numbered in the series, comprises a technical supplement, Projection of a Metropolis.) 
Lichtenberg in this volume under review undertakes the challenging task of providing 
an informed projection of that sector of the New York Metropolitan Region’s economy
that competes widely throughout the country, namely, “national-market”
activities. This sector stands in contrast to the sectional-market and the local-market 
sectors. Of the New York Region’s 6.7 million employed persons, about 2.4 million 
(or 38 per cent) are estimated to be involved in the national-market sector. 

Implicit in this emphasis on the national-market phase of the Region’s economy
is the reasoning, although only briefly developed by the author, that the likely future 
growth of this exogenous or independent sector is more important as a basis for pro- 
jecting the entire Regional economy than are the remaining sectional- and local- 
market sectors which are more fully dependent, in turn, upon the growth of the Re- 
gion. One is reminded of the basic and service categories so characteristic of much 
previous economic base analysis. It is to the credit of the team working on this New 
York Region study that a spirit of sophisticated caution marks their overall ap- 
proach. In line with this the author states with care and considerable refinement his 
conception of the complexities of the interplay among the sectors designated. This 
would seem to mark an advance over most previous comparable studies. 

The most important contributions of this book are the conceptual approach and 
the corresponding method of statistical analysis. Once Lichtenberg has succeeded in 
identifying national-market activities, he approaches the projection of these activities 
for the Region by analyzing (a) the “mix” of such activities and its effect on growth 
rates for the Region and (b) the “competition” of other regions for each kind of ac- 
tivity and the resulting impact on Regional growth. 

To examine New York’s mix of national-market industries, the author standardizes 
the growth rates of these industries so as to demonstrate how total employment in the 
New York national-market sector would have grown had each New York industry 
grown at the same rate it grew in the nation. New York is shown to be characterized 
by a favorable mix of these industries, with their weighted average growth rate 
higher than that for the nation as a whole. In short, if New York as a region were not 
to lose any portions of these industries to other regions in the U. S., it would tend
to continue to increase its share of national-market industry employment. 

The book then goes on to compare the actual growth rate for the Region with the 
“expected” growth rate, as derived by the standardization process described above. 
The degree to which the Region has failed to keep up with its expected growth pro- 
vides a quantitative measure of the extent to which competition has drawn some in- 
dustrial segments to other regions. 
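
The standardization just described can be sketched with a small calculation. The industries, employment figures, and growth rates below are hypothetical and are not drawn from Lichtenberg's tables; the function name is likewise an assumption. Expected growth applies each industry's national rate to the Region's own base-period employment, the "mix" effect is the excess of expected growth over the national average, and the "competition" effect is the gap between actual and expected growth.

```python
def expected_growth(base_emp, national_rates):
    """Regional growth rate had each industry grown at its national rate."""
    total = sum(base_emp.values())
    return sum(base_emp[i] * national_rates[i] for i in base_emp) / total

# Hypothetical regional employment (thousands) at the start and end of a period.
base = {"apparel": 300.0, "printing": 200.0, "central_offices": 500.0}
end = {"apparel": 280.0, "printing": 230.0, "central_offices": 650.0}
# Hypothetical national growth rates by industry, and for all such industries together.
national_rates = {"apparel": 0.05, "printing": 0.20, "central_offices": 0.40}
national_all = 0.20

expected = expected_growth(base, national_rates)
actual = sum(end.values()) / sum(base.values()) - 1.0
mix_effect = expected - national_all    # positive: favorable industry mix
competition_effect = actual - expected  # negative: ground lost to other regions
print(expected, actual, mix_effect, competition_effect)
```

In this invented case the Region is weighted toward fast-growing industries, so the mix effect is positive, while actual growth falls short of expected growth, so the competition effect is negative. The two signs match the pattern the book reports for New York, although the magnitudes here mean nothing.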

There follow detailed analyses of the factors that are most likely to influence future 
economic growth of the sector under consideration. Lichtenberg concludes that the 
two kinds of impact—of mix and of competition—will work largely to offset each other 
in projections to 1985. To quote: 


“. . . We anticipate that the Region will continue to have a fast-growing mix of
employment in national-market activities compared to the nation, and will continue
also to show a competitive weakness, by and large, in national-market industries 
and services. The net result, we expect, will be that the Region’s rate of employment 
in national-market activities will differ very little from the nation’s.” (pp. 179-80) 


The New York Metropolitan Region, it is judged, will particularly see a rise in empha- 
sis on service-type activities, on unstandardized products and services, and on estab- 
lishments that must make quick decisions and adapt quickly to changes. In this 
analysis, the main emphasis is on manufacturing industries. But insofar as some other, 
nonmanufacturing activities were identified as involving national-market segments, 
these were also studied. Hence, a chapter on headquarters offices adds to the com- 
prehensiveness of Lichtenberg’s research. 
In a supplementary chapter, Edgar M. Hoover explores “ . . . a number of differ- 
ent demographic and economic factors that have played a part in shaping the Re- 
gion’s population growth and structure and the size and character of its labor force.” 
(p. 221) He concludes that unlike earlier periods, when a large and diverse labor pool 
played a considerable role in making New York particularly attractive for certain 
types of industry, it is likely that any excess in available employees will continue to 
diminish, and New York will no longer be able to claim this unique external economy 
of excess skilled and available labor. 

Louise P. Lerdau, supported by an array of empirical information, examines the 
ratio of employment in consumer trade and services to that for the nation and dis- 
cusses New York’s ability to maintain a strong competitive position with respect to 
these. In this second supplement, she presents the thesis that in general New York 
is becoming more like other metropolitan regions in the character of its retail trade, 
but that in some spheres of unstandardized offerings and services (e.g., as an art cen- 
ter and as a commercial theater center) New York has been, and may be expected 
to remain, successful in preserving very special advantages and hence a most marked 
dominance. 

The text of the book is apparently intended to reach a broad range of potential 
readers (and not primarily statisticians), but this is supplemented by various Ap- 
pendices comprising methodological notes. In one presentation 446 manufacturing 
industry categories are allocated according to “dominant locational characteristic,” 
by inertia, transport costs, labor costs, external economies, and unclassified. 

This is a solid study. Census data are utilized with patience and processed by re- 
sourceful yet essentially straight-forward statistical methods. (No problems of sta- 
tistical inference arise.) As with the other books in the series, a sense of good judg- 
ment is always sufficiently ascendant so that at no point do the statistics either seem 
to dictate or to stray far from a full interpretative discussion. The conceptual ap- 
proach and the accompanying modes of statistical analysis will undoubtedly have a 
strong impact on the shape of metropolitan economic studies to come. It will be im- 
portant, too, to see whether and how the results of this study (along with those in the 
companion volumes) will be used in formulating public policies and plans so as intel- 
ligently to guide the future shape of the New York Metropolitan Region. 


An Introduction to Infinitely Many Variates. Enders A. Robinson. New York: Hafner 
Publishing Co., 1959. Pp. 132. $4.75. 


EMANUEL PARZEN, Stanford University 


THE problem of the organization of statistical instruction can be formulated from 
many points of view. One of its more important versions seems to me to be the 
following: to design a two-year sequence of courses to be taken by students of the 
physical, biological, and social sciences and engineering, who desire to treat the ran- 
dom aspects of the phenomena which they are studying. Among the many questions 
to be resolved are the role to be played in this program by subjective probability 
and the Bayesian approach to decision problems, probability theory and stochastic 
processes, and the approach to analysis of variance and time series analysis by means 
of linear space ideas. 

The author of the book under review feels that the limit theorems of probability 
theory (other than the Law of Large Numbers and the Central Limit Theorem) and 
the Hilbert space structure of stationary time series deserve a place in the basic pro- 
gram of instruction of students of statistics. He has written a short, but fact-filled 
and well-written, book with the aim of giving “a concise presentation of probability 
theory, limit theorems, and stationary stochastic processes. . . . So that overall unity 
would not be lost in details and lengthy description, many of the proofs and applica- 
tions are either outlined or left as exercises. Throughout the book the student is en- 
couraged to go to the references, which is the key toward making him research- 
minded.” 

The basic notions of measure theory and probability theory are presented in Chap- 
ters 1 and 2 (29 pages). The central limit problem (conditions for the convergence in 
law of a sequence of sums of independent random variables) and infinitely divisible 
laws are treated in Chapter 3 (15 pages). Chapters 4-6 (35 pages) are a clear self- 
contained treatment of linear operators in Hilbert space and their spectral representa- 
tions. 

Chapter 7 (30 pages) treats (wide sense) stationary stochastic processes as families 
of elements in a Hilbert space. The author gives a thorough treatment of the spectral 
decomposition theorem and the Wold decomposition theorem. One notion of which 
the author makes good use and which is not even mentioned in most books on sta- 
tionary processes is that of subordination, introduced by Kolmogorov in 1941; if 
$x_1(t)$ and $x_2(t)$, defined for $t = 0, \pm 1, \cdots$, are a multiple stationary time series, then 
$x_2(t)$ is subordinate to $x_1(t)$ if the Hilbert space spanned by $\{x_2(t),\ t = 0, \pm 1, \cdots\}$ 
is contained in the Hilbert space spanned by $\{x_1(t),\ t = 0, \pm 1, \cdots\}$. References are 
given to work on statistical inference on time series; in addition the last five pages of 
the text discuss the author’s own work on wavelet theory. Fifteen pages of exer- 
cises, three pages of references, and a four-page index conclude the book. 


Theory of Differential Equations. Andrew R. Forsyth. New York: Dover Publications, 
Inc., reprint, 1959. Three Volumes. Pp. 2766. $15.00. 


PAUL BROCK, Stanford Research Institute 


DOVER Publications has photographically reproduced as a three-volume set this 
six-volume opus of Andrew Russell Forsyth. The original volumes were published 
by Cambridge University Press during the years 1890 to 1906. 

Forsyth conceived the work in four parts: I: Exact Equations and Pfaff’s Problem. 
II: Ordinary Equations, Not Linear (2 volumes). III: Ordinary Linear Equations. 
IV: Partial Differential Equations (2 volumes). 

Dover volume I includes part I and the first volume of part II. Dover volume II 
contains the concluding volume of part II and part III. Dover volume III contains 
both volumes of part IV. The reproduction includes the original title pages, prefaces, 
tables of contents, and indices. 

The reprint set, technically, is excellent, and should prove of great value to the 
scientific community. 


An Introduction to the Calculus of Finite Differences and Difference Equations. Ken- 
neth S. Miller. New York: Henry Holt and Co., Pp. viii, 167. $4.50. 


GORDON E. LATTA, Stanford University 


ALTHOUGH designed for a one semester course in the fundamentals of the theory of 
finite differences, the text actually comes close to being useful as a reference for 
some of the more important topics used in the application of difference equations. 
The first chapter develops the theory of finite differences paralleling the develop- 
ment of the differential calculus, and includes the factorial polynomials, Stirling num- 
bers, and summation by parts. Chapters two and three are devoted to infinite prod- 
ucts, gamma and beta functions, and Bernoulli polynomials, including applications 
to the Euler-Maclaurin summation formula. The fourth and final chapter deals 
with linear difference equations in the real domain, and contains the most important 
theorems on this topic. 
The text is rigorous, and contains many valuable techniques and ideas. It is un- 
fortunate that there is no apparent motivating connection between the chapters, but 
this does not detract from the usefulness of the book. 


Elementary Analysis: A Modern Approach. H. C. Trimble and Fred W. Lott, Jr. Engle- 
wood Cliffs, New Jersey: Prentice-Hall, Inc., 1960. Pp. xii, 621. $6.95. 


HOWARD E. CAMPBELL, Michigan State University 


IT is stated in the preface that the book was written as a text for a one year pre- 
calculus course. Also mentioned there is that certain topics such as combinations, 
permutations and the binomial theorem are omitted because they should be in a pre- 
statistics course. The present reviewer feels that the binomial theorem should be in 
a pre-calculus course and more generally that a proper pre-calculus course and cal- 
culus should eliminate the need of a pre-statistics course. 

The book begins with a discussion of the elementary number systems and includes 
material on their postulational development. Cartesian products are then used to 
define relations and functions which leads to a consideration of the elementary func- 
tions. There follows material on elementary algebra, derivatives, solution approxi- 
mation, and graphs of second degree equations in two variables. The book continues 
with the solution of linear systems of equations by determinants (rather than the 
simpler and more complete method of reducing to echelon form) and some elementary 
material on matrices. Next come chapters on logarithms, trigonometry and two- 
dimensional analytic geometry including some mention of n dimensional and non- 
Euclidean geometries. The last chapter contains a discussion of several simple ab- 
stract mathematical structures. There are many sets of exercises appropriately dis- 
tributed throughout the text and two lists of references. 

The authors do a good job of presenting many advanced concepts, in addition to 
communicating the necessary manipulative techniques, by giving for the most part, 
clear detailed discussions of ideas with many examples. 

The book is worthy of consideration as a text in a pre-calculus course, but one 
should bear in mind that when it is used, the instructor should be more mathe- 
matically advanced and mature than most graduate students. This is not only due to 
the presence in the text of advanced ideas, but because, often, definitions of terms 
are not made prominent or given in a concise way, and because more motivation for 
considering number pairs and postulates should be presented. 


Calculus of Functions of One Argument. Edward J. Cogan, Robert Z. Norman, and 
Gerald L. Thompson. Englewood Cliffs, New Jersey: Prentice-Hall, Inc., 1960. Pp. x, 587. 
$8.50. 


MELCHER P. FOBES, The College of Wooster 


THIS thorough and excellent text contains more than enough material for a one- 
year introductory course in calculus, together with the necessary material on 
analytic geometry and trigonometry. Although it offers no statistical applications, it 
provides in highly rigorous form the beginning of a statistician’s mathematical train- 
ing. The topics treated are essentially those of the usual texts on the calculus, with 
the omission of solid analytic geometry, and partial differentiation and multiple inte- 
gration, which occur only very briefly in connection with exact differential equations. 

Although the topics treated are conventional, the handling of them is often differ- 
ent. The opening chapters introduce sets, real numbers, induction, sequences, func- 
tions (carefully defined as mappings), limits (including ε- and δ-definitions and proofs), 
and continuity (with statements and proofs of most of the fundamental continuity 
theorems). The notational distinction between a function and its values is made ex- 
plicit throughout the text by the use of boldface type for functions. In particular the 
choice of the symbol $\mathbf{x}$ for the identity function keeps the notation similar to the con- 
ventional and yet preserves the distinction between the function $\mathbf{x}^2$ and its value, $x^2$, 
for some unspecified real number $x$. 

Theorems on differentiation and integration are carefully proved, and include a 
somewhat unusual proof of the chain-rule formula, to take care of the case usually 
slighted, and a proof of the existence of the definite integral of a continuous function. 
The exponential function is introduced as a solution of the differential equation Df = f, 
and the logarithm function as its inverse. Linear homogeneous differential equations 
of order n with constant coefficients make their early appearance immediately after 
this, and this treatment is supplemented in the final chapter by a more thorough dis- 
cussion of linear equations and the usual types of first-order equations. 

All essential analytic trigonometry appears in brief in the chapter on the calculus 
of the trigonometric functions, and plane analytic geometry is discussed through 
conics with axes parallel to the coordinate axes. There is also a chapter on infinite 
series, including the usual convergence tests, power series, and Taylor’s theorems. 

This is a well-balanced book. It gives prime importance, as it should, to theory, 
but it does not neglect applications, a fair-sized collection of problems, nor intuitive 
motivation in most cases where it would be helpful. Because of its compactness and 
rigor, it might prove somewhat formidable to a class of mixed abilities, but it should 
set a class of able students off to a first-rate start on their mathematical careers. 


Proceedings of the 1959 Computer Applications Symposium. Armour Research Founda- 
tion. Illinois Institute of Technology, 1960. Pp. x, 155. $3.00. Paper. 


THIS conference was the sixth such event with papers devoted to business, manage- 
ment, engineering and scientific applications. The reports deal with methodology 
for exploiting the capabilities of specific machines to solve research problems. Included 
is a paper on computer technology in the U.S.S.R. 
I. O. 


Historisk Statistik for Sverige (Historical Statistics of Sweden). Central Bureau of Sta- 
tistics. Tables not published in Volumes I and II. Stockholm: Central Bureau of Statis- 
tics, 1960. Pp. 18, 284. No price listed. 


TWO previous volumes have been published on historical statistics of Sweden. 
Preparation of additional data covering other areas has been so time-consuming 
that the present publication has been issued, essentially as an interim report, to pro- 
vide tentative estimates while the more complete compilation is carried on. 

The series in the present publication include data up to 1950, taken mostly from 
the Swedish Statistical Abstract for 1951 and prior years. Areas covered include: 
mining and manufacturing production, foreign trade, transportation and communi- 
cations, banking and insurance, prices, cost-of-living and consumption expenditures, 
housing and construction, wages and salaries, social welfare, education, government 
operations, national income, and elections. Although an interim publication, the vol- 
ume could have benefited from more explicit statement of sources and descriptive 
notes. 
R. F. 


Bibliography on Income and Wealth, Vol. VII, 1955-56. Phyllis Deane, Editor. Chicago: 
Quadrangle Books, 1960. Pp. 131. (No price given) 


THIS volume is the seventh in the series of bibliographies by the International 
Association for Research in Income and Wealth. References are given to material 
published in 1955 and 1956 in the fields of social accounting, international compari- 
sons of national income and wealth, and problems of statistical methodology and 
model building. Coverage is extremely broad, and the bibliography is notable for its 
inclusion of many items dealing with Soviet Bloc economies. 
M. N. 


Economic Atlas of the Soviet Union. George Kish. Ann Arbor, Michigan: University of 
Michigan Press, 1960. Pp. 96. $10.00. 


AN INTRODUCTORY section presents descriptive features of the economic geography 
of Soviet Russia with the aid of maps of physical features, vegetation zones, 
administrative divisions, air transportation and population. The description is con- 
tinued with brief sections devoted to each of fifteen regions into which the author has 
divided the country. Each of these sections contains four maps showing agriculture 
and land use in a particular region, mining and minerals, industry, and transportation 
and cities. A page of text accompanies each set of maps. Although considerable in- 
formation is presented in a condensed and convenient form, the addition of tables 
of population, production of various categories of goods, and such other commonly 
used information as might be estimated on a city and regional basis would probably 
have enhanced the usefulness of the volume to many. C. H. 


The Alcoholic Psychoses: Demographic Aspects at Mid-Century in New York State. 
Benjamin Malzberg. New Haven, Connecticut: Yale Center of Alcohol Studies, 1960. 
Pp. ix, 46. $2.00. Paper. 


THIS volume consists of a set of tabulations and analyses of the first admissions for 
alcoholic psychoses to New York State hospitals for mental disease by character- 
istics of interest. When possible, the number of first admissions is compared to the 
appropriate population base. Characteristics covered include the standard demo- 
graphic variables, age, sex, rural-urban environment, marital status, economic status 
and occupation, education, racial and ethnic differences, nativity and migration 
status, as well as the somewhat less usual variables, personality, drug use, type of 
psychosis, and duration of attack. Each table is accompanied by a brief verbal sum- 
mary of its contents and interpretation of its salient findings. Information on basic 
data sources, definition of terms, and other aspects of the methodology is sparse. 
J.C. 


Statistics of Sources and Uses of Finance, 1948-58. Organization for European Economic 
Cooperation, Paris: 1960. Pp. 195. $2.50. 


FINANCIAL statistics assembled in this volume provide a systematic survey of 
money and credit developments in O. E. E. C. countries from 1948 through 1958. 
These are integrated in the general framework of the national income accounts and 
are, as nearly as possible, comparable from year to year and country to country. The 
volume thus provides a useful supplement to the series on national product and ex- 
penditure regularly published by O. E. E. C. and is an aid in the analysis of balance- 
of-payments and a variety of monetary problems. M. N. 


Statistics of Deadly Quarrels. Lewis F. Richardson. Chicago: Quadrangle Books, 1960. 
Pp. xxix, 373. $12.50. 

Arms and Insecurity. Lewis F. Richardson. Chicago: Quadrangle Books, 1960. Pp. xxv, 
307. $10.00. 


THESE two volumes establish Richardson as an important precursor of von Neu- 
mann and Morgenstern¹ and Schelling² in the mathematical analysis of conflict. 
Richardson’s approach to the problem of conflict, however, is quite different from 
more recent work, being descriptive and deterministic in character 
rather than prescriptive, as the latter is. 

Statistics of Deadly Quarrels consists of a painstaking analysis of a mass of historical 
data on wars and other conflicts and careful study of the causes of these conflicts. 
The exhaustive list of wars and fatal quarrels from 1820 to 1949, together with a pre- 
cise statement of their conditions and presumptive causes and estimates of the fatali- 
ties involved, will prove highly useful to future investigators. 

In the companion volume, Arms and Insecurity, Richardson presents a large num- 
ber of mathematical models of conflict and tests these models using data assembled in 
the aforementioned volume as well as additional material. Almost all of Richard- 
son’s models consist of systems of differential equations and thus constitute what 
might be called deterministic models of conflict behavior. In this sense, Richardson 
represents an older tradition in the application of mathematics in the social sciences, 
but he should not therefore be neglected. 
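
For orientation, the best known of these systems is the two-nation arms-race model, shown below in the form in which it is usually quoted from Richardson's work. The notation is the conventional one and is not taken from this review: $x$ and $y$ are the armament levels of the two sides, $k$ and $l$ are "defense" coefficients, $\alpha$ and $\beta$ are "fatigue and expense" coefficients, and $g$ and $h$ are constant "grievance" terms.

```latex
\begin{aligned}
\frac{dx}{dt} &= k\,y - \alpha\,x + g, \\
\frac{dy}{dt} &= l\,x - \beta\,y + h .
\end{aligned}
```

Once the initial levels are given, the whole trajectory is fixed, which is the sense in which such a model is deterministic. With positive coefficients, the equilibrium of this linear system is stable when $\alpha\beta > kl$.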

Both volumes contain excellent introductions to Richardson’s work: by Quincy 
Wright and C. C. Lienau in the first, and by N. Rashevsky and Ernesto Trucco in 
the second. 

M. N. 


1 J. von Neumann and O. Morgenstern, Theory of Games and Economic Behavior, 2nd ed., (Princeton: Princeton 
University Press, 1947). 
2 T. C. Schelling, The Strategy of Conflict (Cambridge: Harvard University Press, 1960). 


The Post-Enumeration Survey: 1950. An Evaluation of the 1950 Census of Population 
and Housing. U.S. Department of Commerce, Bureau of the Census. Technical Paper No. 4. 
Washington: Bureau of the Census, 1960. Pp. vi, 93. $1.00. Paper. 


THIS report presents an analysis of the results of the Post-Enumeration Survey 
(PES) of the 1950 Census of Population and Housing. The survey was undertaken 
to check the completeness of enumeration and the accuracy of the reporting of these 
two Censuses. Chapter Headings are: I—Introduction and Summary; II—Errors in 
the Population Count; III—Errors in the Count of Occupied Dwelling Units; 
IV—Classification Errors in Population and Housing Data; V—Detailed Tables; 
VI—PES Methods and Procedures. J.C. 
Heizer, Robert F. and Cook, Sherburne F., 
Editors. The Application of Quantitative 
Methods in Archaeology. Chicago, Illinois: 
Quadrangle Books, Inc., 1960. $7.50. 

Hickman, W. Braddock. Statistical Meas- 
ures of Corporate Bond Financing Since 
1900. Princeton, New Jersey: Princeton 
University Press, 1960. $9.00. 

Hilton, P. J. and Wylie, S. Homology The- 
ory: An Introduction to Algebraic Topol- 
ogy. New York: Cambridge University 
Press, 1961. $14.50. 

Hoffman, Kenneth and Kunze, Ray. Linear 
Algebra. Englewood Cliffs, New Jersey: 
Prentice-Hall, Inc., 1961. $10.00 Trade, 
$7.50 Text. 

Johnson, Roger A. Advanced Euclidean Ge- 
ometry (Modern Geometry) An Elemen- 
tary Treatise on the Geometry of the Tri- 
angle and the Circle. New York: Dover 
Publications, Inc., 1960. $1.65. Paper. 
(Reprint) 

Khintchine, A. Y. Mathematical Methods in 
the Theory of Queueing. (Number 7 of 
Griffin’s Statistical Monographs and 
Courses, edited by M. G. Kendall.) New 
York: Hafner Publishing Company, 
1960. $5.50. Paper. 

Kish, George. Economic Atlas of the Soviet 
Union. Ann Arbor, Michigan: Univer- 
sity of Michigan Press, 1960. $10.00. 

Leiby, James. Carroll Wright and Labor Re- 
form: The Origin of Labor Statistics. Cam- 
bridge, Massachusetts: Harvard Univer- 
sity Press, 1960. $4.75. 

MacDougall, Sir Donald. The Dollar Prob- 
lem: A Reappraisal. (Number 35 of Es- 
says in International Finance.) Prince- 
ton, New Jersey: Princeton University 
Press, 1960. Price not listed, paper. 

MacMahon, Percy A. Combinatory Analy- 
sis. (Two volumes bound as one.) New 
York: Chelsea Publishing Company, 
1960. $7.50. 

Malzberg, Benjamin. The Alcoholic Psy- 
choses: Demographic Aspects at Midcen- 
tury in New York State. New Haven, 
Connecticut: Publications Division, Yale 
Center of Alcohol Studies, 1960. $2.00. 
Paper. 

Menzler, F. A. A. The First Fifty Years 
1910-1960. Kent, England: Staples 
Printers Ltd., 1960. 25/-. 

Moore, Shirley, Editor. Science Projects 
Handbook. Washington, D. C.: Science 
Service, Inc., 1961. $.50. Paper. 

Moore, Wilbert E. and Feldman, Arnold S., 
Editors. Labor Commitment and Social 
Change in Developing Areas. New York: 
Social Science Research Council, 1960. 
$3.75. 

Murphy, T., Norris, K. P., and Tippett, 
L. H. C. Statistical Methods for Textile 
Technologists. Manchester, England: The 
Textile Institute, 1960. Price not listed, 
paper. 

National Bureau of Economic Research. 
Trends in the American Economy in the 
Nineteenth Century. (Studies in Income 
and Wealth—Volume 24.) Princeton, 
New Jersey: Princeton University Press, 
1960. $15.00. 

National Science Foundation. Federal Funds 
for Science: IX. The Federal Research 
and Development Budget, Fiscal Years 
1959, 1960, and 1961. Washington, D. C.: 
United States Government Printing Of- 
fice, 1960. $.50. Paper. 

Nelson, Boyd L. Elements of Modern Sta- 
tistics. New York: Appleton-Century- 
Crofts, Inc., 1960. $6.00. 

Nixon, J. W. A History of the Interna- 
tional Statistical Institute 1885-1960. The 
Hague, The Netherlands: International 
Statistical Institute, 1960. $2.40. Paper. 

Palmer, Edgar Z., Editor. City and Re- 
gional Wage Comparisons. (Business Re- 
search Bulletin Number 64.) Lincoln, 
Nebraska: University of Nebraska, 1960. 
Price not listed, paper. 

Pillai, K. C. Sreedharan. Statistical Tables 
for Tests of Multivariate Hypotheses. Ma- 
nila, The Philippines: The Statistical 
Center, University of The Philippines, 
1960. Price not listed. 

Raiffa, Howard and Schlaifer, Robert. Ap- 
plied Statistical Decision Theory. Boston, 
Massachusetts: Division of Research, 
Harvard Business School, 1961. $9.50. 

Rapoport, Anatol. Fights, Games, and De- 
bates. Ann Arbor, Michigan: The Uni- 
versity of Michigan Press, 1960. $6.95. 

Roll, James A. Mortgage and Real Estate 
Investments of Nebraska’s Domestic Life 
Insurance Companies, 1947-56. (Busi- 
ness Research Bulletin Number 65.) Lin- 
coln, Nebraska: University of Nebraska, 
1960. Price not listed, paper. 

Schloss, Samuel and Hobson, Carol Joy. 
(United States Department of Health, 
Education and Welfare) Statistical Sum- 
mary of State School Systems 1957-58. 
Washington, D. C.: United States Gov- 
ernment Printing Office, 1960. $.15. 
Paper. 

Schmeckebier, Laurence F. and Eastin, 
Roy B. Government Publications and Their 
Use, Revised Edition. Washington, D. C.: 
The Brookings Institution, 1961. $6.00. 

Sharp, Henry, Jr. Modern Fundamentals of 
Algebra and Trigonometry. Englewood 
Cliffs, New Jersey: Prentice-Hall, Inc., 
1961. $8.65 Trade, $6.50 Text. 

Silk, Leonard S. The Education of Business- 
men. New York: Committee for Eco- 
nomic Development, 1960. No Charge. 
Paper. 

Slichter, Sumner H., Healy, James J., and 
Livernash, E. Robert. The Impact of 
Collective Bargaining on Management. 
Washington, D. C.: The Brookings In- 
stitution, 1960. $8.75. 

Solodovnikov, V. V. Introduction to the 
Statistical Dynamics of Automatic Con- 
trol Systems. New York: Dover Publica- 
tions, Inc., 1960. $2.25. Paper. (Trans- 
lated from First Russian Edition.) 

Solomon, Herbert, Editor. Mathematical 
Thinking in the Measurement of Be- 
havior. Glencoe, Illinois: The Free Press 
of Glencoe, 1961. $7.50. 

Southworth, Constant and Buchanan, 
W. W. Changes in Trade Restrictions Be- 
tween Canada and the United States. 
Washington, D. C.: Canadian-American 
Committee, 1960. $2.00. Paper. 

Stanton, Ralph G. Numerical Methods for 
Science and Engineering. Englewood 
Cliffs, New Jersey: Prentice-Hall, Inc., 
1961. $9.00 Trade, $6.75 Text. 

Stolper, Wolfgang. The Structure of the 
East German Economy. Cambridge, Mas- 
sachusetts: Harvard University Press, 
1960. $10.00. 

Suppes, Patrick and Atkinson, Richard C. 
Markov Learning Models for Multiperson 
Interactions. Stanford, California: Stan- 
ford University Press, 1960. $8.25. 

Survey and Research Corporation, the Sta- 
tistical Advisory Group. Better Statistics 
in Korea. Seoul: Surveys and Research 
Corporation, 1960. Price not listed. Paper. 






Takács, Lajos. Stochastic Processes: Prob- 
lems and Solutions. New York: John 
Wiley and Sons, Inc., 1960. $2.75. 

Tolley, G. S. and Riggs, F. E., Editors. 
Economics of Watershed Planning. Ames, 
Iowa: Iowa State University Press, 1961. 


United States, Contracting Parties to the 
General Agreement on Tariffs and Trade. 
International Trade 1959. Geneva: United 
Nations, 1960. $2.00. Paper. 

United States Bureau of the Budget, Office 
of Statistical Standards. 1960 Supple- 
ment to Economic Indicators Historical 
and Descriptive Background. Washington, 
D. C.: United States Government Print- 
ing Office, 1960. $.60. Paper. 

United States Bureau of the Census. The 
Post-Enumeration Survey: 1950. Wash- 
ington, D. C.: Bureau of the Census, 
1960. $1.00. Paper. 

United States Department of Health, Edu- 
cation, and Welfare. Social Security Ad- 
ministration. A Report on Social Secur- 
ity Programs in the Soviet Union. (Pre- 
pared by the United States team that 
visited the U.S.S.R. under the East-West 
Exchange Program in August-September 
1958.) Washington, D. C.: United States 
Government Printing Office, 1960. $1.00. 
Paper. 

United States Department of Labor, Bu- 
reau of Labor Statistics. Collective Bar- 
gaining in the Basic Steel Industry. 
Washington, D. C.: United States Gov- 
ernment Printing Office, 1961. $1.25. 
Paper. 

United States Joint Economic Committee 
of Congress. 

A Study of the Dealer Market for Federal 
Government Securities. $.40. 

Economic Policies for Agriculture in the 
1960's. $.25. 

Economic Programs for Labor Surplus 
Areas in Selected Countries of Western 
Europe. $.25. 

Employment, Growth, and Price Levels. 
$.30. 


Energy Resources and Government. $2.00. 

Subsidy and Subsidylike Programs of the 
United States Government. $.25. 

Washington, D. C.: United States Gov- 
ernment Printing Office, 1960. Paper. 

United States Joint Economic Committee of 
Congress. Current Economic Situation 
and Short-Run Outlook. Washington, 
D. C.: United States Government Print- 
ing Office, 1961. $.70. Paper. 

United States National Health Survey. 
Department of Health, Education, and 
Welfare. 

Health Statistics: Hernias Reported in 
Interviews. $.25. 

Health Statistics: Interim Report on 
Health Insurance. $.45. Washington, 
D. C.: United States Government 
Printing Office, 1960. Paper. 

Watson, G. L. Integral Quadratic Forms. 
New York: Cambridge University Press, 
1961. $5.00. 


The American Statistical Association has engaged in a rather wide variety of activities 
during 1960, ranging from the exploration of several new fields to the closer collaboration 
with sister societies. An activity in which the Board, the Council and various members 
have been consulted and to which they have given direction has been the conference of 
societies interested in statistics. As a result of considerable discussion with other societies 
and much spade work by Rensis Likert, the 1959 President, and Morris Hansen, Presi- 
dent in 1960, the first meeting to investigate closer collaboration among statistical and 
certain other societies was held this year at the Onchiota Conference Center, Sterling Forest, 
Tuxedo, New York, with the aid of a grant from the Rockefeller Foundation. 

The societies sending representatives included: American Society for Quality Control, 
American Statistical Association, Biometric Society (Eastern North American Region 
and Western North American Region), Econometric Society, Institute of Mathematical 
Statistics, Institute of Management Sciences, Operations Research Society of America, 
Psychometric Society and Society of Actuaries. Attending in an advisory capacity were 
representatives of the American Institute of Biological Sciences and the Conference Board 
of Mathematical Sciences. Both of these organizations have their own form of joint ar- 
rangement among societies in their particular areas of interest. 

The meeting covered the weekend of September 16, 17, and 18, 1960. A total of 24 
persons took part in the discussions, representing the several societies and diverse view- 
points. 

Prior to the gathering, a memorandum was circulated, outlining some suggestions con- 
cerning different forms that closer collaboration and cooperation might take, as well as 
touching upon the kinds of different functions that might be performed collectively. The 
functions included the following areas (but were not intended to be all-inclusive): 


1. to serve as a focal point for collecting and disseminating professional information 
about statistics, statisticians, careers in statistics, teaching of statistics, etc.; 

2. to serve as a voice for the statistical community on matters of public relations; 

3. to serve as a source for advice in forming national delegations to international sta- 
tistical meetings, conferences, symposia, and so forth; 

4. to serve as a source of high level advice on statistical problems of national impor- 
tance. This might mean the appointment of special committees to study or review 
statistical programs, etc., on request; 

5. to serve as a coordinating agency in the planning of national or regional joint meet- 
ings of some or all statistical societies. This might include organizing conferences or 
symposia on problems of major interest to statisticians. 


Coming out of the conference was a fairly large number of suggestions for functions with 
which a council would be concerned. A few of these were: publishing a bulletin or news- 
letter and cooperation on editorial services; publishing of joint directories, translations, 
etc.; recruitment and placement registry; fellowship programs; international cooperation, 
and so forth. A small writing committee was named “to carry forward the suggestions of 
the meeting and provide that the participants be reconvened to take action on the work 
of the writing committee. . . .” 

The Board of Directors of ASA has encouraged closer cooperation and collaboration 
with all societies with an important interest in statistics, although there has been no firm 
commitment on the form such action should take. The expansion of statistical activities 
tends to engender specialized groups. A means for providing more communication among 
these groups in areas of common interest should enhance the advancement of statistical 
science and increase its contribution to the development of many fields. The Rockefeller 
grant, administered by ASA but in close collaboration with Biometrics (ENAR and 
WNAR) and the Institute of Mathematical Statistics, provides a small continuing fund 
for further exploration during the next two years. 
COMMITTEE ACTIVITIES 

1. Early in 1960, the Special Subcommittee on Legislative Oversight of the Committee 
on Interstate and Foreign Commerce of the U. S. House of Representatives requested the 
American Statistical Association to arrange for an independent study to examine and 
evaluate the statistical methods used by the principal rating services in determining the 
ratings ascribed to television and radio programs. The Association appointed a Technical 
Committee on Broadcast Ratings, consisting of William Madow (Chairman), Herbert 
Hyman and Raymond Jessen. The Technical Committee is readying its report for presen- 
tation early in 1961. It was arranged that the report is to be independent and free from 
guidance by either ASA or the Congressional Committee. It is also understood that the 
Association has the privilege of publishing the report after its submission to the House 
Committee, if it so chooses. 

2. From time to time over the years, the system of electing Fellows of the Association 
has been re-examined. Recently, suggestions and criticism led to the creation of a new 
committee—the Committee to Review Procedures for Selection of Fellows. It is headed 
by A. Ross Eckler and all its members have served at some time on the Association’s 
Committee on Fellows. They represent a diversity of fields. In recommending establish- 
ment of this Committee, the Board noted that a considerable length of time would be 
needed to investigate the matter and submit suggestions for revised procedures. The 
Review Committee does not replace the Committee on Fellows, which will continue to 
elect Fellows yearly while the study is being conducted. 

3. Other new committees established in 1960 have been requested to investigate wider 
participation by statisticians; they include: Committee on Statistics in Marketing, Solo- 
mon Dutka, Chairman; Committee on Electronic Computers and Statistics, Richard 
Ruggles, Chairman; Committee on Applications in Management Science and Operations 
Research, Herbert Solomon, Chairman; Committee on Statistics in Meteorology, Herbert 
Thom, Chairman; Committee on Statistics in Accounting, Frederick Stephan, Chairman; 
and Committee on Audio-Visual Aids, Paul Clifford, Chairman. 


PUBLICATIONS 


With the cooperation of speakers and discussants at the Annual Meeting at Stanford 
in August 1960, the Proceedings of both the Business and Economic Statistics Section 
and the Social Statistics Section were printed promptly following the meeting. As usual, 
both volumes are being sold at a special rate to members of ASA. The Board this year 
voted to authorize the publication of the Social Statistics Section Proceedings on a per- 
manent yearly basis, as it had previously done in the case of the Business and Economic 
Statistics Section Proceedings. The former is in its third year; the latter has been 
published since 1954. 

This year, work has started on the preparation of a new brochure on careers in sta- 
tistics. This is being sponsored jointly by ASA and the Institute of Mathematical Statis- 
tics. A professional writer and author, Michael Amrine, has been retained to compose the 
brochure, with advice and guidance from a committee appointed from both societies. 
At year’s end, the first full draft was under discussion and publication of the final brochure 
is expected in the first half of 1961. 


OTHER ACTIVITIES 


Two new Chapters were granted charters from the Board in 1960—the Arizona Chapter 
(centered around Phoenix) and the Harrisburg, Pa. Chapter. Both groups report excellent 
attendance at their meetings, with a high level of interest in Chapter activities. The 
addition of these new local groups brings the total list of Chapters to forty. During the 
year, the Oklahoma City Chapter reported a cessation of Chapter activities in that area. 
The Tulsa Chapter has been encouraged to widen its meetings and activities to include that 
area also, wherever possible. 

Future Annual Meetings now scheduled are: 


1961 New York City, Roosevelt Hotel, December 27-30 (Joint with the Allied 
Social Science Associations) 
1962 Minneapolis, Hotel Leamington, early September 

1963 Cleveland, combined facilities of Case Institute of Technology and West- 
ern Reserve University (the campuses are adjacent) and Tudor Arms 
Hotel, early September 

1964 Chicago, Congress Hotel, December 27-30 (Joint with the Allied Social 
Science Associations) 

1965 Philadelphia, Bellevue-Stratford Hotel, early September 


Illustrating the increasing interest in statistics is the proposal for a separate Statistics 
Section in the American Association for the Advancement of Science. A number of societies 
have joined with ASA in requesting that AAAS give consideration to its establishment. 
Statistics is rapidly being recognized in many fields as an indispensable tool. ASA now 
maintains representation in Section K (Social and Economic Sciences) and Section A 
(Mathematics). Our interest in these sections would remain, but the new proposed section 
would aid in encouraging further adoption of statistical methods and applications. 


REPORT OF THE SECRETARY-TREASURER, 1960 


During 1960 the growth in the Association’s membership and subscriptions, the increase 
in sales, as well as other items of income, have all continued the upward curve that has 
characterized these yearly reports for more than a decade. (See the auditor’s report 
following, and Table I below, “Dues and Subscriptions Income 1949-1961.”) However, 
in that same period, the cost of all the Association’s activities and services has grown even 
faster. 

The last increase in dues came in 1948 (from five to the present eight dollars per year 
for regular members in the United States and Canada). As a result of that increase (and 
very strict economizing), the Association was able to build up a modest cumulative surplus 
fund over the years to serve as a cushion when expenses exceed income, and for use in 
launching new projects and activities (for example, the financial support to help start 
Proceedings volumes and Technometrics). As I have pointed out in previous year-end 
reports, the cost per page of printing the Association’s publications, salaries, office ex- 
penses and supplies—indeed, just about every single item on the expense budget—have 
increased almost every year. I have warned of this seemingly continuing trend for several 
years. 


TABLE I. DUES AND SUBSCRIPTIONS INCOME, 1949-61 
(Rates have remained the same during this period) 

Year      Dues (Old & New Members)    Non-member Subscriptions 

1949              $32,400                     $ 8,600 
1950               32,500                       8,850 
1951               33,800                       9,400 
1952               37,450                      10,000 
1953               39,450                      10,100 
1954               40,400                      10,850 
1955               42,500                      11,150 
1956               45,100                      12,450 
1957               45,400                      13,350 
1958               51,050                      14,700 
1959               55,600                      15,900 
1960               59,400                      16,300 
1961*              64,000                      17,000 

* Projected. 

Table II shows the comparative cost of some representative expenses for the period 
1949-1961; we are publishing elsewhere a more detailed analysis of the problem. 
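
As a rough illustration of the comparison being drawn, the sketch below computes the percentage growth from 1949 to 1960 for a few series taken from Tables I and II. Only columns whose headings are unambiguous in the tables are used; the series names and the helper `pct_growth` are labels chosen here for illustration, not terms from the report.

```python
# Figures copied from the 1949 and 1960 rows of Table I and Table II.
# The dictionary keys are illustrative labels, not the report's own headings.
FIGURES = {
    "Dues income": (32_400, 59_400),
    "Non-member subscriptions": (8_600, 16_300),
    "Journal expense": (10_700, 22_500),
    "Salaries": (18_600, 27_900),
}


def pct_growth(start: float, end: float) -> float:
    """Percentage change from start to end."""
    return (end - start) / start * 100.0


growth = {name: round(pct_growth(a, b), 1) for name, (a, b) in FIGURES.items()}

for name, g in growth.items():
    print(f"{name:26s} {g:6.1f}%")
```

On these four series the picture is mixed: the Journal expense more than doubles while dues income grows by somewhat over four-fifths, and salaries grow more slowly than either. A fuller comparison would need the complete expense budget, which the tables do not give.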

The picture for 1961 concerning expenses is somewhat worse. Projections for the forth- 
coming year show an expected loss of approximately $7,500. Obviously, we should not 
permit our reserve to be dissipated by these yearly losses. In addition to seriously weaken- 
ing the financial situation of the Association, it also shortchanges the prospects of expan- 
sion in future activities. In a period of gradual but sustained inflation, it is unrealistic 
to expect that dues should not be increased. If the Association is to provide the profession 
with adequate publications, meetings and other essential central services, funds must be 
available. We have been one of the most conservative professional societies in the United 
States in the matter of dues increases, but for us, as for many others, the time to 
increase income finally comes. 

A special committee has been appointed by the President to investigate the facts and 
problems regarding an increase in dues; this committee will report its recommendations 
to the Board at its 1961 spring meeting. Further material concerning this will appear 
in The American Statistician. 

The membership continues to grow, as it has for the past decade. (Table III shows 
membership figures for the years 1949-60.) At the beginning of 1960, the members 
totaled 7,026. During the past year, 1,152 new members joined (a new high). From the 
list, 188 were removed through resignation or death and 475 were dropped for nonpayment 
of the 1960 dues. This leaves a net increase for 1960 of 486 and the Association starts 
1961 with 7,515 members. 
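
The membership figures in this paragraph can be checked with a short reconciliation. The sketch below uses only the numbers quoted above; the variable names are chosen here for illustration. Note that the component figures give a net increase of 489, three more than the 486 printed, while the year-end total of 7,515 agrees with Table III.

```python
# Membership movement during 1960, as quoted in the paragraph above.
start_1960 = 7_026           # members at the beginning of 1960
joined = 1_152               # new members during the year
resigned_or_died = 188       # removed through resignation or death
dropped_nonpayment = 475     # dropped for nonpayment of 1960 dues

net_change = joined - resigned_or_died - dropped_nonpayment
end_1960 = start_1960 + net_change

print("net increase:", net_change)
print("members at start of 1961:", end_1960)
```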

After two years, Technometrics has a subscription list considerably larger than originally 
projected when the magazine was first planned. By the end of 1960, over 2,500 copies 
were mailed to subscribers. These include members of the two sponsoring societies (ASQC 
and ASA) at the special member rate, and libraries, business and industrial organizations, 
etc., at the higher nonmember rate. The report by the auditor for the full year 1960 is 
attached, giving details on the financial operations of Technometrics. 

Going back to the Association, the amounts originally budgeted for income and expense 
for 1960 were $98,850 and $98,100, respectively; an approximately balanced budget. 
(See the year-end report by the auditing firm for full details.) 

Because of rising costs (including increases from the printers of both the Journal and 
The American Statistician), it now appears that 1961 will also be a deficit year. At present, 
$102,250 is proposed for the income budget and $109,750 for expense, leaving a loss of 
$7,500. In the event that our situation brightens, some income items might increase 
thereby lessening the 1961 loss to a small extent. Unless it is felt that dues cannot be 
raised by the end of 1961, I do not believe it would be good policy to retrench enough in 
1961 to bring us into balance. As pointed out elsewhere, our staff has not grown in the 
past ten years. The increased services have come from the increasing efficiency of a staff 
that works well together. 


TABLE II. REPRESENTATIVE EXPENSES, 1949-61 

                       American                            Promotion for 
Year      Journal    Statistician     Rent     Salaries     New Members 
                                                             (Expense) 

1949      $10,700      $ 3,700       $2,600    $18,600        $1,350 
1950       10,600        4,500        2,350     15,400         1,750 
1951       10,000        4,800        2,400     14,800         1,650 
1952       12,850        5,800        2,400     14,700         2,400 
1953       15,850        6,600        2,400     12,250         2,250 
1954       17,750        6,300        2,400     14,350         1,700 
1955       26,250        6,650        2,400     15,500         2,700 
1956       15,850        6,650        2,400     15,400         2,300 
1957       16,950        8,400        2,400     17,600         2,700 
1958       25,000        8,600        2,400     19,550         3,750 
1959       24,250       10,650        2,400     23,750         4,250 
1960       22,500       11,350        2,400     27,900         4,600 
1961*      31,000       12,500        2,400     29,400         4,400 

* Projected. 


TABLE III. GROWTH IN MEMBERSHIP, 1949-60 

Year      Number of Members (at year-end) 

1949              4,324 
1950              4,237 
1951              4,356 
1952              4,655 
1953              4,900 
1954              5,150 
1955              5,536 
1956              5,691 
1957              5,667 
1958              6,325 
1959              7,026 
1960              7,515 

Almost all who attended the Annual Meeting, held this year in August 1960 at Stanford 
University, felt it was one of the best ever held. The campus meeting rooms, the dormitory 
accommodations, the special arrangements of Dean Bowker’s Committee, and the 
pleasant surroundings all contributed to its success. The program was excellent. However, 
the attendance of ASA members was perhaps a bit disappointing. The total of registrants 
was not inconsiderable: over 1,200 persons attended. However, only one-half that number 
(640) were members of ASA. One-sixth of the total were visitors, that is, not members 
of any one of the five societies meeting jointly. The fact that less than ten per cent of the 
Association’s domestic members attended this convention raises doubts as to how 
soon another full scale west coast meeting should be scheduled. In spite of a very low 
room and board rate at Stanford (and the attractions of a vacation in California), a 
surprisingly small number made the trip from the eastern portion of the country. Just as 
disappointing was the fact that attendance from the west coast (outside of the San 
Francisco area) and other western states was rather poor. In view of these findings, it 
may be that regional conferences should be encouraged until we have a more equitable geo- 
graphic distribution of members between the east and west. On the other hand, western 
meetings are important for the rounded development of the Association. Steps have 
been taken during the past few years to secure active participation of elected officers from 
all of our geographic regions; meetings help this process still further. 

Other activities have been reported elsewhere in the Association’s publications. Perhaps 
the most important recent development has been the program for closer cooperation 
among societies whose interest includes statistics in some important way. The entire 
field is growing rapidly in importance and in recognition and I feel proud of the leadership 
the American Statistical Association has been taking. 


DONALD C. RILEY 
Secretary-Treasurer 








AUDITORS’ REPORT 
AMERICAN STATISTICAL ASSOCIATION 
DECEMBER 31, 1960 


ALEXANDER GRANT & COMPANY 
CERTIFIED PUBLIC ACCOUNTANTS 
910 SEVENTEENTH STREET, N.W. 
WASHINGTON 6, D.C. 


Board of Directors 
American Statistical Association 
Washington, D. C. 

We have examined the balance sheet of the American Statistical Association (a 
non-profit organization) as of December 31, 1960, and the related statements of income 
and association equity for the year then ended. Our examination was made in accordance 
with generally accepted auditing standards, and accordingly included such tests of the 
accounting records and such other auditing procedures as we considered necessary in the 
circumstances. 

In our opinion, the accompanying balance sheet and statements of income and associa- 
tion equity present fairly the financial position of the American Statistical Association at 
December 31, 1960, and the results of its operations for the year then ended, in conformity 
with generally accepted accounting principles applied on a basis consistent with that of 
the preceding year. 

Comments in regard to the scope of our examination and details of certain items 
shown in the balance sheet and statements of income and association equity are presented 
in the following paragraphs. 


BALANCE SHEET COMMENTS 
Cash 
Balances comprising the total cash of $32,574 at December 31, 1960 are as follows: 

American Security & Trust Company 
    [description illegible] ...................................... $32,540 
    [description illegible] ......................................      14 
    [description illegible] ......................................      20 

                                                                   $32,574 


Cash in banks and funds on deposit with savings and loan institutions were confirmed 
directly with the depositories. At December 31, 1960, cash in the amount of $6,133 was 
restricted for the unexpended balances of grants (see grant account analysis on page 480) 
and by the amount received from the Business and Economic Section of A.S.A. to be 
applied against the cost of future regional meetings. 
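
The restricted amount of $6,133 can be traced to three balances reported elsewhere in the auditors' statements. The sketch below is a simple cross-check under the assumption that the restriction consists of exactly these three items; the variable names are illustrative.

```python
# Balances at December 31, 1960, taken from the grant analysis and the balance sheet.
nsf_unexpended = 4_207            # National Science Foundation grant, unexpended
rockefeller_unexpended = 1_115    # Rockefeller Foundation grant, unexpended
regional_meetings_reserve = 811   # amount reserved for future regional meetings

restricted_cash = nsf_unexpended + rockefeller_unexpended + regional_meetings_reserve
print("restricted cash:", restricted_cash)
```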


Accounts receivable 
The following items make up the accounts receivable total of $10,784 at December 31, 
1960: 

House Subcommittee on Legislative Oversight ...................... $ 2,191 
[description illegible] ..........................................   5,000 
Dividends on Mutual Funds held by broker .........................     714 
[description illegible] ..........................................     336 
[description illegible] ..........................................     216 
[description illegible] ..........................................     210 
[description illegible] ..........................................   1,931 
[description illegible] ..........................................     186 

                                                                   $10,784 


480 AMERICAN STATISTICAL ASSOCIATION JOURNAL, JUNE 1961 


Accounts payable—trade 


At December 31, 1960 accounts payable—trade of $22,707 included the following: 
Printing and mailing costs related to December 1960 Journal 
Printing and mailing costs related to the October and December 1960 American 
Statistician 
Costs related to the Technical Committee on Broadcast Ratings 


$22,707 


Unexpended grants 


During 1960 the Association was the recipient of a grant from The Rockefeller Founda- 
tion. The grant funds are to be used toward the cost of convening representatives of 
professional societies interested in the use of statistical techniques to explore means of 
obtaining greater coordination among them. The total amount of the grant is approx- 
imately $9,000, of which $3,000 was received in March 1960. The funds are to be used over 
a period of three years, at the end of which time any unexpended balance will revert to 
The Rockefeller Foundation. 


An analysis of all grant accounts for 1960 follows: 

                                             National        The 
                                             Science         Rockefeller 
                                             Foundation      Foundation 
                                             Grant           Grant 

Unexpended balance—January 1, 1960           $12,800 
Received in 1960                                 — 
Expended in 1960                               8,593 

Unexpended balance—December 31, 1960         $ 4,207         $1,115 








INCOME AND EXPENSE COMMENTS 
Income 


Total income for the year 1960 was $6,528 higher than that for the preceding year. 
The major contributing factors to this increase were a larger number of paid members 
and an increase in Journal advertising, subscriptions and sales. 


Expenses 


The increase of $15,195 in 1960 expenses as compared with 1959 was due mainly to 
increases in salaries of $4,189 and publications’ expenses of $4,603. 


GENERAL 


During the course of our examination certain deficiencies in the internal control and 
related operating procedures were noted. Comments in regard to our findings and recom- 
mendations are reported upon in a supplemental letter. (The letter follows.) 


ALEXANDER GRANT & COMPANY 
Washington, D. C. 


March 24, 1961 
May 11, 1961 
Board of Directors 


American Statistical Association 
Washington, D. C. 


We have recently completed our audit of the American Statistical Association for the 
year ended December 31, 1960. We found the records and financial procedures to be gen- 
erally satisfactory. However, in certain areas we noted deficiencies in the system of internal 
control and in the related operating procedures. Our findings and recommendations are 
presented in the ensuing paragraphs for your consideration. 


POLICY AND PROCEDURES 
Findings 
While the operating procedures of the Association are generally efficient, there is no 
formal policy guide or procedural outline for guidance of employees in carrying on the 
activities of the Association at its national office headquarters. In this regard, it appears 
that only two individuals were familiar with the over-all policies of the Association. 


Recommendations 


To facilitate executive direction; to provide a valuable training aid for new employees; 
to make certain that no voids in internal control procedure exist; and to insure that the 
Association’s activities would not be seriously curtailed by the absence or incapacity of 
the two individuals presently familiar with policy and operating procedures, we recom- 
mend that further action be taken on the program to develop the Association’s policy and 
operations manual. 

To date, the work on the operations manual has consisted of developing a topic outline 
covering organization structure and the major phases of national office operations. The 
general outline of the manual is as follows: 


1. Organizational Structure and Policy 
2. General Accounting Records 
3. Budget 
4. Cash Receipts and Disbursements Procedures 
5. Income From Dues, Subscriptions, Advertising and Sales 
6. Maintenance of Membership and Subscription Files 
7. Annual Meeting 
8. Investments 
9. Personnel Manual 


CASH RECEIPTS 
Findings 
While improvement within the past two years has been noted in processing of cash re- 
ceipts for members’ dues, subscriptions and sales, a serious problem still exists at peak 
periods following the billing of membership dues. During December 1960 and January 
1961, cash receipts aggregating several thousand dollars, were left in an unprotected state 
in the Association’s office for from one to two weeks. This presents a serious problem with 
respect to loss due to fire or theft. The delay in the processing of cash receipts at the end of 
1960 necessitated rescheduling the audit and a consequent delay of several weeks in our 
submission of the audit report. 


Recommendations 


One of the basic principles of good internal control is that cash receipts be deposited in- 
tact on a daily basis. A plan for processing receipts during peak periods should be formu- 
lated and put into effect. 

TRAVEL EXPENSES 
Findings 

In our examination of travel expenses, we noted that even though such expenditures 
were modest and substantially below the amounts budgeted, there were several instances 
where no expense reports were submitted. 
According to Internal Revenue Service regulations, if travel expenses are not reported 
to the employer, in keeping with reasonable business practices, the advances must be re- 
ported by the employer to Internal Revenue Service on an annual information return. In 
addition, the employee has the obligation to report them in his individual income tax re- 
turn. 


Recommendations 

A formal written procedure should be instituted covering the accounting procedures for 
travel advances and expenses. A standard expense report form should be required to sub- 
stantiate all travel disbursements. 


PAYROLL 

Findings 

We noted that proper Federal income tax withholdings were not being made in all cases 
because of a misinterpretation of the pertinent regulations. We presume that the individ- 
uals concerned reported all income received from the Association in their personal returns 
and paid the required tax; however, the Association has the obligation to make the re- 
quired withholdings from all salaries. 

There were no formal authorizations in the files for retirement plan withholdings; and 
we could not locate a number of the W-4 forms for Federal income tax withholding exemp- 
tions. 


Recommendations 


Income tax exemption forms should be obtained for each employee and kept current, as 
well as signed authorizations for payroll deductions. 

Even though the staff is small, serious consideration should be given to preparing a 
formal personnel manual or at least a statement of policy covering such matters as vacations, 
holidays, overtime and sick pay, office hours, rest periods and absences. 


MUTUAL FUND INVESTMENTS 
Findings 
The regulations of the Association, based on the by-laws, do not provide for proper 
accounting control over the mutual fund investment certificates aggregating $26,285 at 
December 31, 1960. The Secretary-Treasurer, although bonded, has sole responsibility for 
these assets. 


Recommendations 


We recommend that joint responsibility for mutual fund investments be assigned to two 
individuals, preferably the Secretary-Treasurer and an officer or director who is independ- 
ent of the accounting function. Since the investment in mutual funds represents a signifi- 
cant portion of the Association’s assets, we recommend that the Finance Committee ar- 
range to make a more frequent review of the Association’s investment portfolio. 


MINUTES 
Findings 
In our review of the Directors’ meeting minutes, we noted that some minutes were not 
signed or indexed. The Association does not maintain a formal minute book containing the 
proceedings of Directors’ meetings, although copies of minutes are distributed to Council 
members after each meeting. 


Recommendations 

The minutes of all directors’ meetings and other important committee meetings should 
be indexed, signed by the secretary and placed in a formal minute book to be maintained 
at the national office as a permanent record. 
GENERAL 


The primary reason for the existence of most of the aforementioned deficiencies is that 
there is no established policy or procedure in the specific areas. Most of the activities at the 
national office are carried out under the verbal instructions of one or two individuals, and 
no provision has been made for their absence due to prolonged illness or termination of serv- 
ice with the Association. 

The Association has operated efficiently with a small staff over the years and has felt it 
necessary to economize wherever possible. We feel, however, that the recommendations 
we have made could be effected with very little additional cost. 

With the knowledge that the officers, employees and membership of the Association 
wish the internal accounting and procedural activities to keep pace with its rapidly ex- 
panding work in the field of promoting efficiency and new techniques in statistical matters, 
we present these comments and recommendations for your consideration. We shall be 
pleased to meet with you or answer any questions you may have relative to the matters 
outlined in this letter. 

Very truly yours, 
ALEXANDER GRANT & COMPANY 
AMERICAN STATISTICAL ASSOCIATION 
COMPARATIVE BALANCE SHEET 
DECEMBER 31, 1960 AND 1959 

                                                       December 31,        Increase 
                                                     1960        1959     (decrease) 
Assets 
Cash                                              $ 32,574    $ 29,628    $  2,946 
Investments (supplemental schedule on page 488) 
  Funds on deposit in savings accounts and 
    savings and loan institutions                   26,871      51,628     (24,757) 
  U. S. Government bonds                            29,581      17,667      11,914 
  Investment in stocks                              26,285      20,213       6,072 
Accrued interest receivable                                        816        (518) 
Accounts receivable                                 10,784      10,277         507 
Inventories 
  Old Journals                                       3,647       3,647         — 
  Monograph of Kinsey Report                                     2,603         (72) 
  [description illegible]                                           24         (24) 
Prepaid expenses                                                 2,311       4,808 

                                                              $138,814    $    876 

Fixed assets—at cost 
  Furniture and fixtures                                         8,026      (4,550) 
  Office machines                                                  500         809 

                                                              $  8,526    $ (3,741) 
  Less accumulated depreciation                                  6,448      (4,247) 

                                                              $  2,078 

                                                  $142,274    $140,892 








The accompanying report letter is an integral part of this balance sheet. 
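
Several 1960 asset figures are not legible in this copy of the balance sheet, but each can be recovered from the 1959 balance and the change column. The sketch below is a reconstruction under stated assumptions, not part of the auditors' report: it assumes that the two surviving figures in each affected row are the 1959 balance and the increase (decrease), and it takes the 1960 accounts receivable figure from the auditors' comments. The row labels and variable names are chosen here for illustration, and "Other inventory" stands in for a row whose description is illegible.

```python
# (1959 balance, change during 1960) for rows whose 1960 figure is not legible.
# Assumption: the surviving figures are the 1959 and increase (decrease) columns.
ILLEGIBLE_ROWS = {
    "Accrued interest receivable": (816, -518),
    "Monograph of Kinsey Report": (2_603, -72),
    "Other inventory": (24, -24),          # hypothetical label; description illegible
    "Prepaid expenses": (2_311, 4_808),
    "Furniture and fixtures": (8_026, -4_550),
    "Office machines": (500, 809),
    "Accumulated depreciation": (6_448, -4_247),
}

# 1960 balance = 1959 balance + change.
derived_1960 = {row: prior + change for row, (prior, change) in ILLEGIBLE_ROWS.items()}

# 1960 figures that are legible in the statement or quoted in the auditors' comments.
LEGIBLE_1960 = {
    "Cash": 32_574,
    "Savings deposits": 26_871,
    "U.S. Government bonds": 29_581,
    "Investment in stocks": 26_285,
    "Accounts receivable": 10_784,
    "Old Journals": 3_647,
}

current_rows = (
    "Accrued interest receivable",
    "Monograph of Kinsey Report",
    "Other inventory",
    "Prepaid expenses",
)
current_assets_1960 = sum(LEGIBLE_1960.values()) + sum(derived_1960[r] for r in current_rows)

net_fixed_assets_1960 = (
    derived_1960["Furniture and fixtures"]
    + derived_1960["Office machines"]
    - derived_1960["Accumulated depreciation"]
)

total_assets_1960 = current_assets_1960 + net_fixed_assets_1960

for row, value in derived_1960.items():
    print(f"{row:30s} {value:>8,}")
print(f"{'Total assets, 1960':30s} {total_assets_1960:>8,}")
```

The derived figures sum, together with the legible ones, to the printed 1960 total of $142,274, which lends some support to the column assignment assumed here.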


AMERICAN STATISTICAL ASSOCIATION 
COMPARATIVE BALANCE SHEET 
DECEMBER 31, 1960 AND 1959 

                                                       December 31,        Increase 
                                                     1960        1959     (decrease) 
Liabilities 
Accounts payable—trade                            $ 22,707    $ 20,898    $  1,809 
Chapter dues 
  [chapter name illegible]                           1,333       1,159         174 
  Washington, D. C.                                    724         620         104 
  Philadelphia                                         149         103          46 
Due to National Science Foundation (sales of the 
  Index to the Journal)                              5,333       1,364       3,969 
Subscriptions received for Technometrics             2,570       1,748         822 
Due to American Sociological Society                   110         135         (25) 
Payroll taxes                                          623         387         236 
Accrued directory expense                            3,000         —         3,000 
[description illegible]                                            119        (119) 

                                                  $ 36,549    $ 26,533    $ 10,016 

Deferred income 
  Membership dues                                   26,359      21,691       4,668 
  Subscriptions 
    Journal                                          8,560       9,579      (1,019) 
    American Statistician                              376         403         (27) 
  Proceedings 
    Business and Economic Section                    1,241         — 
    [description illegible]                            585         — 
  [description illegible]                            3,125       3,348        (223) 
  [description illegible]                              467         467         — 

                                                  $ 40,713    $ 35,488    $  5,225 
Amount reserved for future regional meetings           811         811         — 
Unexpended grants 
  National Science Foundation                        4,207      12,800      (8,593) 
  Travel Fund Grant                                    —         7,319      (7,319) 
  The Rockefeller Foundation                         1,115 

                                                              $ 20,119    $(14,797) 
Association equity                                  58,879      57,941         938 

                                                  $142,274    $140,892    $  1,382 


The accompanying report letter is an integral part of this balance sheet. 


AMERICAN STATISTICAL ASSOCIATION 
COMPARATIVE STATEMENT OF INCOME AND EXPENSE 
YEARS ENDED DECEMBER 31, 1960 AND 1959 

                                                        Years ended 
                                                        December 31         Increase 
                                                      1960       1959      (decrease) 
Income 
Dues—old members                                   $ 52,384    $48,795     $ 3,589 
    —new members                                      7,182      6,787         395 
Subscriptions—Journal                                16,288     15,908         380 
             —American Statistician                     633        626           7 
Advertising—Journal                                   6,286      4,927       1,359 
           —American Statistician                     1,706      1,486         220 
Sales—Journal                                         5,690      2,811       2,879 
     —American Statistician                              86        113         (27) 
     —Business and Economic Statistics Section 
        Proceedings                                   3,508      4,642      (1,134) 
     —Social Statistics Section Proceedings           2,157      1,639         518 
     —other sales                                       125        659        (534) 
Mail list income                                      1,688      2,441        (753) 
Interest income                                       2,346      2,789        (443) 
Miscellaneous                                           223        151          72 

Total income                                       $100,302    $93,774     $ 6,528 
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Expenses 


Salaries....... wale Gl 6 Sg A , : 23,741 4,189 

Pension plan 7 , ‘ 909 121 
hile 6:6 ee 2,400 117 

Depreciation 360 

Supplies and office expense... . i 4,344 

Postage and delivery 3,335 2,339 

Telephone and telegraph 792 

Accounting services 993 

Clerical services............ se adie 4: 218 

Taxes and insurance............ Piecse 26 951 

Promotion activity 2,675 

Committee expense 376 861 

Secretary’s travel and expense 528 

Annual meeting—net 2,028 572 

Publications (see page 488) 47,501 42,898 4,603 

i ese ' ” ‘Eat ee 798 





Total expenses...... $15,195 
Net Income (Expense)—Operations..... (8,667) 


Other Income 


Dividend income and changes in market value of mu- 
tual fund investments 1,023 


Net Income $ 938 $8,582 $(7,644) 





The accompanying report letter is an integral part of this statement. 


AMERICAN STATISTICAL ASSOCIATION 
COMPARATIVE STATEMENT OF ASSOCIATION EQUITY 
YEARS ENDED DECEMBER 31, 1960 AND 1959 


Years ended 
December 31 Increase 
1960 1959 (decrease) 
Association equity—beginning of year $ 57,941 $49,359 $ 8,582 
Net income for the year........... 938 8,582 (7,644) 





Association equity—end of year....... $ 58,879 $57,941 $ 938 








The accompanying report letter is an integral part of this statement. 
SUPPLEMENTAL SCHEDULES 


AMERICAN STATISTICAL ASSOCIATION 
COMPARATIVE SCHEDULE OF ACTUAL INCOME AND EXPENSES WITH BUDGET 
YEAR ENDED DECEMBER 31, 1960 
Actual 
Year ended over 
December 31, 1960 (under) 
Actual Budget budget 
Income 


Dues—old members . $52,384 $53,000 $ (616) 
—new members........ 7,182 7,000 182 
Subscriptions—Journal 16,288 17,000 (712) 
—American Statistician 633 650 (17) 
Advertising—Journal 6,286 6,000 286 
—American Statistician 1,706 1,800 (94) 
Sales—Journal 5,690 3,000 2,690 
—Social Statistics Section Proceedings 2,157 2,000 157 
—Business and Economic Statistics Section Proceedings 3,508 4,000 (492) 
—other 211 500 (289) 
Mail list income 1,688 2,200 (512) 
Interest income 2,346 2,700 (354) 
Annual meeting—net (expense) (2,029) (1,000) (1,029) 
Miscellaneous 1,433 — 1,433 





$99,483 $98,850 $ 633 
Expenses 


Salaries 27,930 24,700 3,230 

Pension plan 1,030 1,100 (70) 

Publications—Journal printing 22,483 28,000 (5,517) 
—Journal editorial 3,663 2,800 863 
—American Statistician 11,360 10,000 1,360 
—Directory 3,000 3,000 — 
—Social Statistics Section Proceedings. . 2,342 2,000 342 
—Business and Economic Statistics Section Proceedings 3,557 3,500 57 
NE. o< > oaks ees e's 3,000 515 
Promotional activity 3,000 
Secretary's travel and expense 3,000 
Officers’ travel 4 pee 2,500 
Supplies and office expense 4,634 4,000 
Postage and shipping charges 3,335 2,600 
Telephone and telegraph 1,488 1,200 
Accounting services 1,150 1,000 
Committee expenses 376 1,000 
Miscellaneous—depreciation 425 400 
976 700 
285 300 
1,102 300 







$98,545 $98,100 





Net Income $ 938 $ 750 
AMERICAN STATISTICAL ASSOCIATION 
COMPARATIVE SCHEDULE OF PUBLICATIONS’ EXPENSES 
YEARS ENDED DECEMBER 31, 1960 AND 1959 


Year ended 

December 31, Increase 

1960 1959 (decrease) 

Journal—printing............ $22,483 $22,649 $ (166) 
—editorial 3,663 2,703 960 
American Statistician 11,360 10,631 729 
Membership Directory 3,000 717 2,283 
Business and Economic Statistics Section Proceedings 3,557 3,799 (242) 
Social Statistics Section Proceedings 2,342 1,900 442 
Journal reprints—net...... 715 499 216 
GE os. Datae as src 381 — 381 





$47,501 $42,898 $4,603 





AMERICAN STATISTICAL ASSOCIATION 
INVESTMENTS 
DECEMBER 31, 1960 
Income 
Cost or (loss) 
amount for year 
invested ended Effec- 
at Quoted Decem- tive 
December market ber 31, rate of 
31, 1960 value 1960 interest 
Savings Accounts 
City Federal Savings and Loan Association. . $ 5,000 $ 400 
Fidelity Federal Savings and Loan Association. . 5,298 323 
Hyattsville Building Association... . 1,288 50 
Jefferson Federal Savings and Loan Association. 874 34 
Liberty Savings and Loan Association. ..... 4,209 184 
Piedmont Federal Savings and Loan Association 5,202 202 
Trans-Bay Federal Savings and Loan Association 5,000 


$26,871 


U. S. Government Bonds 
U.S. Treasury 43% notes, due November 1964.. $15,014 $15,870 
U. S. Treasury 23% bonds, $10,000 face amount, due November 1961 9,567 9,921 
Federal Land Bank 4% bonds, due May 1962 5,000 5,014 





$29,581 $30,805 








Investments in Stocks 
Tri-Continental Corporation—200 shares. .. . ,053 $7,600 $ 49 
Niagara Shares Corporation—300 shares... . ' (165) 
Consolidated Investment Trust—300 shares... . y77 120 
Massachusetts Investors Growth Stock Fund— 
375 shares... : 6,947 1,206 





$26,285 $1,210 
In addition to the income listed in this schedule, $300 was received from deposits in 
savings accounts which were closed during the year, and $39 was received from U. S. 
Savings Bonds sold during the year. Interest earned on savings accounts deposits and on 
U. S. Government Bonds during 1960 totaled $2,845. Of this amount, $2,346 is reflected 
as income of the current year. The remaining $499 was credited to the Travel Fund Grant 
since it related to interest on grant funds. 


AUDITORS’ REPORT 
TECHNOMETRICS 
DECEMBER 31, 1960 


ALEXANDER GRANT & COMPANY 
CERTIFIED PUBLIC ACCOUNTANTS 
910 SEVENTEENTH STREET, N.W. 
WASHINGTON 6, D. C. 
Management Committee 
Technometrics 
Washington, D. C. 


We have examined the balance sheet of Technometrics (a non-profit organization) as of 
December 31, 1960, and the related statement of income and equity for the year then 
ended. Our examination was made in accordance with generally accepted auditing stand- 
ards, and accordingly included such tests of the accounting records and such other auditing 
procedures as we considered necessary in the circumstances. 

In our opinion, the accompanying balance sheet and statement of income and equity 
present fairly the financial position of Technometrics at December 31, 1960, and the results 
of its operations for the year then ended, in conformity with generally accepted accounting 
principles applied on a basis consistent with that of the preceding year. 

Comments in regard to the scope of our examination and details of certain items shown 
in the balance sheet and statement of income and equity are presented in the following 
paragraphs. 


BALANCE SHEET COMMENTS 
Cash 


Cash at December 31, 1960, in the amount of $3,234 was confirmed directly with the 
depository, American Security and Trust Company. 


Accounts Receivable 


The following items make up the accounts receivable total of $2,616 at December 31, 1960: 
Unremitted subscriptions from the American Statistical Association.......... $2,570 
46 


$2,616 


Accounts payable 


At December 31, 1960, accounts payable aggregating $5,651 included amounts due to 
the following: 
William Byrd Press—(cost of printing the November 1960 issue of Technometrics 
and reprints) 
University of Wisconsin 1,324 
American Statistical Association 1,191 
50 


$5,651 
INCOME AND EXPENSE COMMENTS 
Income 


The $3,293 increase in total income for the year ended December 31, 1960, as compared 
to the year ended December 31, 1959 was due to an increase in subscriptions paid for 
Technometrics. During 1960, no donations were received from sponsoring organizations 
nor was any income derived from advertising. 


Expenses 


Total expenses for the year ended December 31, 1960, increased $2,939 as compared 
to the previous year. This consisted principally of the increased cost of printing the four 
issues of Technometrics. During the year, the American Statistical Association increased 
its charges for various office and administrative costs incurred incident to processing subscriptions 
and preparing mailing list addressograph plates for Technometrics. In 1959, the 
American Statistical Association made only a nominal charge for these services, with 
the understanding that the amount be increased in 1960. 

ALEXANDER GRANT & COMPANY 
Washington, D. C. 
March 24, 1961 


TECHNOMETRICS 
COMPARATIVE BALANCE SHEET 
DECEMBER 31, 1960 AND 1959 


Assets 
Years ended 
December 31 Increase 
1960 1959 (decrease) 
Cash $ 3,234 $ 8,284 $(5,050) 
Savings accounts 17,284 7,070 10,214 
Accounts receivable 2,616 4,567 (1,951) 
Accrued interest receivable 223 71 152 





$23,357 $19,992 $3,365 








Accounts payable $5,651 $4,256 $1,395 
Deferred subscriptions 4,860 
Advances from sponsoring organizations 
American Statistical Association 5,000 5,000 
American Society for Quality Control 5,000 5,000 
Equity 2,106 876 





$23,357 $19,992 


The accompanying report letter is an integral part of this balance sheet. 
TECHNOMETRICS 
COMPARATIVE STATEMENT OF INCOME AND EQUITY 
YEARS ENDED DECEMBER 31, 1960 AND 1959 


Years ended 
December 31 Increase 
1960 1959 (decrease) 
Income 


Subscriptions 

Members of A.S.A. and A.S.Q.C. $8,758 $6,892 $ 1,866 

Non-members 8,664 6,908 1,756 
Sales 

Single copies 1,522 104 1,418 

Reprints—net......... (425) 151 (576) 
Interest 573 141 432 
Donations from sponsoring organizations........... - 1,500 (1,500) 
Advertising.......... - 150 (150) 
Miscellaneous 49 2 47 





Total income..... seg ..... $19,141 $15,848 $3,293 


Expenses 


Publication expenses 
Editorial 1,399 1,721 (322) 
Printing 12,825 10,239 2,586 
Postage and mailing 1,092 578 514 
Maintenance of subscription files 890 511 379 
Promotion 124 1,603 (1,479) 
General and administrative 
PE TI 6 9a5 56. 6 oe a peewee 3% Sylemedh « 320 220 100 
228 60 168 
Overhead expenses billed by A.S.A.............. 974 — 974 
Miscellaneous 59 40 19 





Total expenses $17,911 $14,972 $ 2,939 





Net Income $1,230 $ 876 $ 354 
Equity—Beginning of Year 876 — 876 





Equity—End of Year $2,106 $ 876 $1,230 





The accompanying report letter is an integral part of this statement. 


TECHNOMETRICS SAVINGS ACCOUNTS DECEMBER 31, 1960 


Income 
Amount for year 
invested at ended Effective 
December 31, December 31, rate of 
1960 1960 interest 


Riggs National Bank $ 7,284 $197 
National Security Savings and Loan Association 5,000 188 
Home Foundation Savings and Loan Association 5,000 188 

$17,284 $573 





EDITORIAL COLLABORATORS 


(continued from page vi) 
Herman Rubin, Michigan State University 

Victor E. Smith, Michigan State University 

R. Clay Sprowls, University of California, Los Angeles 

J. H. Stapleton, Michigan State University 

Charles E. Swanson, The Curtis Publishing Company 

Lorie Tarshis, Stanford University 

John Tukey, Princeton University 

Andrew Vazsonyi, Ramo-Wooldridge 

D. F. Votaw, Jr., Yale University 

David L. Wallace, University of Chicago 

John E. Walsh, System Development Corporation 

Martin B. Wilk, Rutgers University 

Lorine Wood, United States Bureau of the Census 

Max Woodbury, New York University 

Theodore D. Woolsey, United States National Health Survey 

Thomas A. Yancey, University of Illinois 

Arnold Zellner, Netherlands School of 
Economics 
UNIVERSITY OF CHICAGO PRESS 

















The first two volumes in the Statistical Research Monographs 
series sponsored by the Institute of Mathematical Statistics 
and the University of Chicago 


The Passage Problem for a 
Stationary Markov Chain 


By J. H. B. Kemperman. Presents systematically a number of methods 
useful in studying the problems of first passage and absorption 
in a Markov chain; in particular, methods for obtaining exact 
formulae for the probabilities under consideration or their moments. 
Numerous illustrations show how each method serves as a natural 
tool for handling many practical problems. 127 pages. $5.00 











Statistical Inference 


for Markov Processes 


By Patrick Billingsley. A general mathematical theory for the sta- 
tistical problems of determining whether mathematical models fit 
empirical data and of estimating any parameters upon which the 
models may depend. The applications which illustrate the mathe- 
matical results make the book useful to workers in the applied 
fields as well as to mathematicians, statisticians, and graduate stu- 
dents in statistics. 75 pages. $4.00 


. . . and other titles of interest 


Modern Factor Analysis 


By Harry H. Harman. Designed to serve the interests of graduate 
students and researchers in statistics, psychology, and related disci- 
plines, this study presents an accurate, up-to-date account of factor 
analysis from its basic foundations through the latest and most ad- 
vanced methods, including the use of high-speed electronic com- 
puters. 1960. 486 pages. $10.00 















An Introduction to the Theory 
of Experimental Design 


By D. J. Finney. A book for the mathematician, statistician, and 